
Migrating Tokio and Futures – What to Look for


I’ve finally found the courage to look into migrating audioserve to the new hyper, tokio and futures crates. As the new futures (0.3) are significantly different from the previous version, mainly due to support for the async/await keywords, I was expecting significant effort, so I kept delaying it. Now that it’s almost done, I’d like to reflect a bit on the effort. Firstly – it was not as bad as expected (although after updating cargo dependencies I got about a hundred errors). Secondly – there are some patterns to follow in the migration, which I’d like to describe further in this article.

Future traits

The most notable change happened in the Future trait: the associated types and the method signature are different, so this is probably the first thing to start with – wherever you use Future or Stream in types, you should rewrite them. The old future had two associated types (one for the successful result, one for the error); the new future has just one (which probably makes more sense, because not every future has to be able to fail – a future can always resolve to a success value). So the obvious replacement in migration is to use Result as the future’s Output type – that way you keep the existing semantics of your future’s results.
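
For illustration, here is a minimal sketch of that change (file_length is a hypothetical function, not from audioserve): in futures 0.1 it would have returned impl Future<Item = u64, Error = io::Error>; in futures 0.3 the error type moves into the single Output associated type as a Result.

use std::future::Future;
use std::io;

// futures 0.1: impl Future<Item = u64, Error = io::Error>
// futures 0.3: one Output type; keep the old semantics with Result
fn file_length() -> impl Future<Output = Result<u64, io::Error>> {
    async { Ok(42) }
}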

The content of the Future trait also changed dramatically – it now contains only the basic poll method, and the combinator methods moved to the new traits FutureExt and TryFutureExt (the latter contains methods related to futures resolving into Result – like and_then, map_err and others). The IntoFuture trait is also gone – so combinators like and_then or or_else must now return a future of the result, not just the result itself.
The situation is similar for Stream (with the additional traits StreamExt and TryStreamExt). The situation is a bit different for Sink, where Item moved from an associated type to a generic type parameter (all combinators are in the SinkExt trait). To use a combinator method, the appropriate trait has to be brought into scope. You can import use futures::prelude::* to get all the common traits at once.

Implementing Future

If your code has types implementing Future, you will have more work now. This is due to the change in the poll method signature; the Future trait now looks like this:

pub trait Future {
    type Output;
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output>;
}

The key change is pinning self. It’s done to ensure that self is not moved in memory in cases when it references some of its own parts. This was introduced especially for async/await blocks, where self-referencing is common (a short explanation of Pin is here).
So a Future implementation has to be rewritten. If your future is Unpin, the implementation is actually almost the same, as Pin implements DerefMut for Unpin targets, so you work with self in much the same way. As Unpin is common for the majority of types, I was able to make my custom futures and streams Unpin too.
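
As a minimal sketch (a made-up Countdown future, not from audioserve), implementing Future for an Unpin type looks almost like it did before – the Pin is essentially transparent:

use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll};

// All fields are Unpin, so the whole struct is Unpin.
struct Countdown(u32);

impl Future for Countdown {
    type Output = u32;
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        if self.0 == 0 {
            Poll::Ready(0)
        } else {
            self.0 -= 1; // mutation through Pin works because Countdown: Unpin
            cx.waker().wake_by_ref(); // ask the executor to poll us again
            Poll::Pending
        }
    }
}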

Another aspect is when your future is polling another future, which is very common – now the reference has to be wrapped in Pin. Again, for Unpin types it’s trivial: Pin::new. Types which are not Unpin can be fixed in memory – the easiest way is to allocate on the heap and pin: Box::pin, which produces Pin<Box<T>>, and that type is Unpin. If your future stores child futures, store them pinned; it’ll be most convenient.
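
As a hedged sketch (a made-up Logged wrapper, not from audioserve): storing the child future boxed and pinned keeps the wrapper itself Unpin and makes polling straightforward.

use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll};

// Wraps another future and logs when it resolves.
struct Logged<T> {
    inner: Pin<Box<dyn Future<Output = T>>>, // Pin<Box<_>> is itself Unpin
}

impl<T> Logged<T> {
    fn new(f: impl Future<Output = T> + 'static) -> Self {
        Logged { inner: Box::pin(f) } // Box::pin fixes the child in memory
    }
}

impl<T> Future for Logged<T> {
    type Output = T;
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        // as_mut() re-borrows the pinned child as Pin<&mut dyn Future>
        match self.inner.as_mut().poll(cx) {
            Poll::Ready(v) => {
                println!("inner future resolved");
                Poll::Ready(v)
            }
            Poll::Pending => Poll::Pending,
        }
    }
}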

Future Trait Object

I hit an interesting problem when migrating Future trait objects. Initially I used the same approach as in the old code, illustrated below:

use futures::{future::ready, prelude::*}; // 0.3.4
use tokio; // 0.2.11

#[tokio::main]
async fn main() {
    let f: Box<dyn Future<Output = u32>> = Box::new(ready(1));

    let f = f.map(|v| {
        println!("Got value {}", v);
        v + 1
    });

    let x = f.await;
    println!("DONE {:?}", x);
}

But I got this strange error:

error: the `map` method cannot be invoked on a trait object
 --> src/main.rs:8:15
  |
8 |     let f = f.map(|v| {
  |               ^^^
  |
help: another candidate was found in the following trait, perhaps add a `use` for it:
  |
1 | use futures_util::future::future::FutureExt;
  |

The FutureExt trait is definitely imported via futures::prelude, so what’s going on?

It turns out the problem is again related to Unpin: the Future trait is only implemented for Box<dyn Future + Unpin>, and FutureExt requires Future – that’s why our type cannot use the map method. The solution is rather simple – either add the Unpin bound, or even better use Pin<Box<...>>, which does implement Future, as illustrated below:

use futures::{future::ready, prelude::*}; // 0.3.4
use tokio; // 0.2.11
use std::pin::Pin;

#[tokio::main]
async fn main() {
    let f: Pin<Box<dyn Future<Output = u32>>> = Box::pin(ready(1));

    let f = f.map(|v| {
        println!("Got value {}", v);
        v + 1
    });

    let x = f.await;
    println!("DONE {:?}", x);
}

async and await

The async and await keywords are a really great improvement for writing asynchronous code. They simplify code notably and make it a lot easier to write (I already mentioned this in this article).

I did not convert all code to async/await (partially due to laziness, partially because I did not want to touch things that worked well and still made sense). However, some parts – like asynchronous file access with tokio::fs – required a rewrite: while in the past the file object was owned by the future (resulting from an operation like read) and moved into the future’s result, now a mutable reference shared with the future is used, so chaining futures became more difficult – but it is quite natural when rewritten in an async block, as the sketch below shows.
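
For illustration, a hedged sketch of the new tokio::fs style (a hypothetical helper, assuming tokio 0.2 with the fs and io-util features): the file is borrowed mutably across awaits inside one async fn, instead of being threaded through a combinator chain.

use tokio::fs::File;
use tokio::io::AsyncReadExt;

// Read up to 1 KiB from the start of a file.
async fn read_start(path: &str) -> std::io::Result<Vec<u8>> {
    let mut f = File::open(path).await?; // open is a future now
    let mut buf = vec![0u8; 1024];
    let n = f.read(&mut buf).await?; // read borrows f mutably across the await
    buf.truncate(n);
    Ok(buf)
}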

Recap

  1. Use futures::prelude::* to import all the useful traits, including the Ext traits.
  2. Update references to the Future and Stream traits – the new ones have a single associated Output type; use Result there to carry the success and error types of the old future.
  3. Update your implementations of futures, streams, etc. Try to make them Unpin – it will make your life easier. If they have inner futures, store them pinned. Box::pin is your friend for making a future Unpin (but it requires an allocation).
  4. Use Pin<Box<...>> (via Box::pin) for your Future (or Stream) trait objects.
  5. Rewrite code to async/await where it pays off: overly complex combinator chains, new futures that require mutable references, etc.

Hyper and TLS


For the audioserve update I wanted to use the new hyper with tokio-tls. It required a bit of documentation reading, but it finally turned out to be quite straightforward.

Here are the supporting functions (in the tls module):

use futures::{future, stream::{StreamExt, TryStreamExt}};
use hyper::server::accept::{from_stream, Accept};
use native_tls::Identity;
use std::fs::File;
use std::io::{self, Read};
use std::path::{Path, PathBuf};
use tokio::net::{TcpListener, TcpStream};
use tokio_tls::TlsStream;

type Error = Box<dyn std::error::Error + 'static>;

pub struct TlsConfig {
    key_file: PathBuf,
    key_password: String
}

impl TlsConfig {
    pub fn new<P:Into<PathBuf>, S:Into<String>>(key_file: P, pass: S) -> Self {
        TlsConfig{
            key_file: key_file.into(),
            key_password: pass.into()
        }
    }
}

fn load_private_key<P>(file: P, pass: &str) -> Result<Identity, Error>
where
    P: AsRef<Path>,
{
    let mut bytes = vec![];
    let mut f = File::open(file)?;
    f.read_to_end(&mut bytes)?;
    let key = Identity::from_pkcs12(&bytes, pass)?;
    Ok(key)
}

pub async fn tls_acceptor(
    addr: &std::net::SocketAddr,
    ssl: &TlsConfig,
) -> Result<impl Accept<Conn = TlsStream<TcpStream>, Error = io::Error>, Error> {
    let private_key = load_private_key(&ssl.key_file, &ssl.key_password)?;
    let tls_cx = native_tls::TlsAcceptor::builder(private_key).build()?;
    let tls_cx = tokio_tls::TlsAcceptor::from(tls_cx);
    let stream = TcpListener::bind(addr).await?.and_then(move |s| {
        let acceptor = tls_cx.clone();
        async move {
            let conn = acceptor.accept(s).await;
            conn.map_err(|e| {
                error!("Error when accepting TLS connection {}", e);
                io::Error::new(io::ErrorKind::Other, e)
            })
        }
    })
    .filter(|i| future::ready(i.is_ok())); // Filter out errors – an error would stop the server from accepting connections

    Ok(from_stream(stream))
}

The main trick is to filter out TLS accept errors – an error in the stream would end the stream, and the server would stop accepting new connections. (And TLS errors happen often – a client may connect with plain HTTP, may not trust the certificate, etc.)

With the above function, the server can be started like this:

#[macro_use]
extern crate log;
use std::convert::Infallible;
use hyper::{Request, Response, Body, Server, service::{make_service_fn, service_fn}};
use tls::{TlsConfig, tls_acceptor};

mod tls;

type Error = Box<dyn std::error::Error + 'static>;

#[tokio::main]
pub async fn main() -> Result<(), Error> {
    env_logger::init();

    let addr = ([127, 0, 0, 1], 3000).into();
    let tls_config = TlsConfig::new("/home/ivan/tls_key/audioserve.p12", "mypass");
    let incoming = tls_acceptor(&addr, &tls_config).await?;

    let make_svc = make_service_fn(|conn: &tokio_tls::TlsStream<tokio::net::TcpStream>| {
        let peer = conn.get_ref().peer_addr().unwrap();
        async move {
            Ok::<_, Infallible>(service_fn(move |req| {
                debug!("Request from {} for path {}", peer, req.uri());
                futures::future::ok::<_, Infallible>(Response::new(Body::from("Hello World!")))
            }))
        }
    });

    let server = Server::builder(incoming).serve(make_svc);
    info!("Listening on https://{}", addr);
    server.await?;
    Ok(())
}

Clone or Reference Count – Which One Is Faster


While I was updating audioserve I hit one part of the code where I was passing a not-so-big structure (containing data for authentication) to an asynchronous task (in the tokio multi-threaded executor); as the task might potentially run in a different thread, the value has to have a 'static lifetime. The easiest solution was to clone the structure and move it into the task (which was the original solution). But during refactoring I realized that reference counting – the Arc type – could be a better solution: it can save a small piece of memory (an Arc is 8 bytes), but it could also perform better (or could it?). To check the latter assumption I ran a couple of tests.

Here is code for the plain structure and its reference counted wrapper:

use std::sync::Arc;

#[derive(Debug, Clone)]
pub struct Secrets {
    shared_secret: String,
    server_secret: Vec<u8>,
    token_validity_hours: u32,
}

impl Secrets {
    pub fn sample() -> Self {
        Secrets {
            shared_secret: "kulisak_jede".into(),
            server_secret: b"01234567890123456789012345678901".to_vec(),
            token_validity_hours: 24*100
        }
    }
}

#[derive(Clone, Debug)]
pub struct SharedSecrets{
    inner: Arc<Secrets>
}

impl SharedSecrets {
    pub fn sample() -> Self {
        SharedSecrets{
            inner: Arc::new(Secrets::sample())
        }
    }
}

Now let’s try the simplest benchmark – passing a physical clone (copying memory) versus passing a reference-counted smart pointer. The former is uninventively called test_cloned, the latter, somewhat confusingly, test_shared:

// These benches need nightly Rust: #![feature(test)] and extern crate test;
const LOOPS: usize = 1_000_000; // 1M iterations, matching the results below

#[bench]
fn test_cloned(b: &mut test::Bencher) {
    b.iter(|| {
        let s = Secrets::sample();
        for _i in 0..LOOPS {
            let _c = s.clone();
        }
    })
}

#[bench]
fn test_shared(b: &mut test::Bencher) {
    b.iter(|| {
        let s = SharedSecrets::sample();
        for _i in 0..LOOPS {
            let _c = s.clone();
        }
    })
}

Now the benchmark results for 1M loops:

 test test_cloned          … bench:  67,909,930 ns/iter (+/- 1,036,866)
 test test_shared          … bench:  12,129,636 ns/iter (+/- 334,998) 

OK, that was expected, right? Arc is faster than copying memory, even when the struct is rather small – ~100 bytes. But the question is how much this is just a toy benchmark. I also tested on another, newer machine and saw the same trend there, but the actual difference was much smaller (about half – probably due to faster memory?).

But that is not how you normally pass a value – the main use case for Arc is passing a reference to threads. So how does it look in this scenario:

use std::thread;

const THREADS: usize = 1000; // 1000 threads, matching the results below

#[bench]
fn test_threaded_cloned(b: &mut test::Bencher) {
    b.iter(|| {
        let s = Secrets::sample();
        let mut threads = vec![];
        for _i in 0..THREADS {
            let c = s.clone();
            threads.push(thread::spawn(move || {
                let _x = c;
            }));
        }
        for t in threads {
            t.join().unwrap();
        }
    })
}

#[bench]
fn test_threaded_shared(b: &mut test::Bencher) {
    b.iter(|| {
        let s = SharedSecrets::sample();
        let mut threads = vec![];
        for _i in 0..THREADS {
            let c = s.clone();
            threads.push(thread::spawn(move || {
                let _x = c;
            }));
        }
        for t in threads {
            t.join().unwrap();
        }
    })
}

And the results for 1000 threads (we cannot normally run 1M threads without modifying kernel parameters – remember the 10K connections problem?). As threads have significant overhead, 1000 of them will be enough for another of our toy benchmarks:

 test test_threaded_cloned … bench:  29,858,099 ns/iter (+/- 415,199)
 test test_threaded_shared … bench:  29,651,799 ns/iter (+/- 345,890)

So as you can see, thread overhead completely hides the difference.

But what about tokio and its threadpool? The question of clone versus reference count originally arose when I was passing values to tokio tasks. So let’s try something like this:

use tokio::runtime::Builder;

const TASKS: usize = 10_000; // 10k tasks, matching the results below

#[bench]
fn test_tokio_shared(b: &mut test::Bencher) {
    let mut rt = Builder::new()
        .threaded_scheduler()
        .build()
        .unwrap();
    b.iter(|| {
        let s = SharedSecrets::sample();
        let mut tasks = vec![];
        for _i in 0..TASKS {
            let c = s.clone();
            tasks.push(rt.spawn(async move {
                let _x = c;
            }));
        }
        rt.block_on(async {
            for t in tasks {
                t.await.unwrap();
            }
        })
    })
}

#[bench]
fn test_tokio_cloned(b: &mut test::Bencher) {
    let mut rt = Builder::new()
        .threaded_scheduler()
        .build()
        .unwrap();
    b.iter(|| {
        let s = Secrets::sample();
        let mut tasks = vec![];
        for _i in 0..TASKS {
            let c = s.clone();
            tasks.push(rt.spawn(async move {
                let _x = c;
            }));
        }
        rt.block_on(async {
            for t in tasks {
                t.await.unwrap();
            }
        })
    })
}

And run 10,000 tokio tasks on a threadpool:

 test test_tokio_cloned    … bench:   6,174,117 ns/iter (+/- 297,534)
 test test_tokio_shared    … bench:   5,730,147 ns/iter (+/- 275,639)

A small difference is visible in favor of Arc (which was expected, based on the previous results).

Finally let’s do a baseline measurement – what if we do not copy memory (test_word_cloned) but just a value on the stack, or pass an Arc reference (test_word_shared), or pass an Rc reference (test_word_shared_rc)? I had to modify the code to add a bit more work, otherwise the first case was optimized so aggressively that it was not executed at all (the duration was probably 0 ns).

#[bench]
fn test_word_cloned(b: &mut test::Bencher) {
    b.iter(|| {
        let s = 24u64;
        let mut acc = 0;
        for _i in 0..LOOPS {
            let c = s.clone();
            acc += c;
        }
        println!("{}", acc);
    })
}

#[bench]
fn test_word_shared(b: &mut test::Bencher) {
    b.iter(|| {
        let s = std::sync::Arc::new(24u64);
        let mut acc = 0;
        for _i in 0..LOOPS {
            let c = *s.clone();
            acc += c;
        }
        println!("{}", acc);
    })
}

#[bench]
fn test_word_shared_rc(b: &mut test::Bencher) {
    b.iter(|| {
        let s = std::rc::Rc::new(24u64);
        let mut acc = 0;
        for _i in 0..LOOPS {
            let c = *s.clone();
            acc += c;
        }
        println!("{}", acc);
    })
}

And results for 1M of loops:

 test test_word_cloned     … bench:         112 ns/iter (+/- 5)
 test test_word_shared     … bench:  12,131,404 ns/iter (+/- 338,120)
 test test_word_shared_rc  … bench:      10,790 ns/iter (+/- 466)

Reference counting is very significantly more expensive than just copying a trivial value (u64). And Arc is significantly more expensive than Rc, because Arc has to update its counts with atomic operations, while Rc uses plain integer arithmetic.

Conclusions

So it looks like Arc could be a little bit faster for my use case, so I now use it in my code instead of cloning the structure.

Can Hashmap Be An Obstacle on Road to Performance


I’m still playing with small exercises; one interesting one is “alphametics” (available on exercism). The goal of the exercise is to evaluate whether an expression like “SEND + MORE == MONEY” is a valid equation when its letters are uniquely replaced with digits (more details in the link above). As I wrote in a previous article, I try to think about performance, and this exercise was quite interesting from that perspective.

My initial solution was based on a brute force approach, searching through all possible assignments of digits to the letters present. It did work, but it was slower than the “best” solution, which searches for a solution column by column – e.g. first find candidate assignments for “D + E == Y” (with a possible carry over), then test them against the next columns – this reduces the search space. So I modified my code to a similar approach (although with somewhat more explicit code) – the code is available on github – branch normal-hash (for the competing solution see src/other.rs). A sketch of the column check is below.
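
A minimal sketch of checking one column with a carry (hypothetical helper, simplified from the idea rather than copied from the repo):

// Check one column, e.g. "D + E == Y": summands are the digits assigned
// to the column's letters, result is the expected digit, carry_in comes
// from the less significant column. Returns the carry out if the column is valid.
fn check_column(summands: &[u8], result: u8, carry_in: u32) -> Option<u32> {
    let sum: u32 = summands.iter().map(|&d| d as u32).sum::<u32>() + carry_in;
    if (sum % 10) as u8 == result {
        Some(sum / 10) // carry propagated to the next column
    } else {
        None
    }
}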

I benchmarked against the other solution, but my code was still slower. (Unfortunately I have to run benchmarks in a VM (VirtualBox), where variance is much higher than on a regular linux installation, but the mean value should be OK, as I verified in previous benchmarks – bench runs hundreds of iterations.)

 test mine  … bench: 1,673,078,653 ns/iter (+/- 1,223,832,876)
 test other … bench:   497,544,602 ns/iter (+/- 121,667,843)

As you can see, mine is about 3 times slower (not looking much at the variance). So when something is slower than required, it’s time for performance profiling. In the past I used valgrind’s callgrind, but this time I decided to use perf – especially because I found a nice article on how to use it to create flame graphs, which are a cool way to visualize the call graph and how individual functions consume CPU time. So join me on a journey to better performance.

Here are the commands to record the call graph and then create a nice flame graph (install perf and the FlameGraph tools, and link the FlameGraph scripts into ~/.local/bin so they can be used easily):

sudo perf record -g --call-graph dwarf target/debug/examples/ten; sudo chmod a+r perf.data
perf script | stackcollapse-perf | flamegraph > flame0.svg

And below is the result – the flame0.svg picture (the SVG itself is interactive if you open it directly):

Flame Graph for solution with default hash function

I really do like these flame graphs, because they give you a nice overview of CPU time allocation across the whole program in one picture. Here we can see that the majority of work is divided between the Combinator iterator (later renamed Permutator), which generates all possible permutations of digit-to-letter assignments, and Column::evaluate, which checks whether a given permutation is valid (the summands add up to the correct value).

Hashmaps are omnipresent data containers, and some languages like Python are built around them (a class is a hashmap, an instance is a hashmap); they are also used intensively in this exercise. If we look a little bit higher in the flame graph, into those “hills”, we see that the majority of time is spent in hashmap operations.

So this brings us back to the title of this article – a significant part of the time is spent in the hashmap; it really looks like the hashmap is an obstacle to better performance here. So how to fix this? Rust is known to have a rather complex default hash function, which protects against DoS attacks, but we probably do not need that here. Let’s replace it (hash functions are pluggable) with something faster from the fasthash crate (quite an indicative name – we do want a fast hash here), namely seahash, which advertises itself as “A bizarrely fast hash function”. The modified exercise code is here.
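
As a hedged sketch of the swap (using the standalone seahash crate here for brevity; the fasthash crate wraps the same algorithm, just under different paths): the std HashMap takes a pluggable BuildHasher as its third type parameter.

use std::collections::HashMap;
use std::hash::BuildHasherDefault;
use seahash::SeaHasher;

// Alias so the rest of the code can stay unchanged.
type FastHashMap<K, V> = HashMap<K, V, BuildHasherDefault<SeaHasher>>;

fn demo() {
    let mut assignment: FastHashMap<char, u8> = FastHashMap::default();
    assignment.insert('S', 9);
    assert_eq!(assignment.get(&'S'), Some(&9));
}

And here are the benchmark results: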

 test mine  … bench: 1,310,181,847 ns/iter (+/- 279,684,550)
 test other … bench:   414,885,580 ns/iter (+/- 23,982,803)

Not much difference, actually – barely at the edge of the results’ variance (which, as I said, is a bit higher when testing in a virtual machine: sometimes the VM just sits and waits, so a few runs get unusually high timings). Let’s also look at the flame graph:

Flame Graph with faster hasher

It looks quite similar to the previous case, right? But let’s look at the details – the part where the hashmap looks up values:

Comparing default hasher (SIP) to SeaHash

The areas highlighted by the blue lasso are the parts of the code where, I think, the actual hash calculation is done. So we can see that seahash is faster, but not enough to make a significant difference. A more radical change is needed. If we look at the flame graphs above, we see there is still quite some overhead in the hashmap machinery itself. Since we are mapping char to byte, and the chars are ASCII only, we are actually mapping byte to byte – so we can represent the mapping as a 256-element array: the index is the key and the array cell is the value. So let’s try it – the code is here. The array is actually implemented as a Vec wrapped in simple map and set interfaces, so it fits well into the existing code; a sketch of the idea follows.
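
A minimal sketch of the idea (the real fastmap::FastMap in the repo has a richer interface; names here are simplified):

// Keys are single ASCII bytes, so a 256-slot array replaces the hashmap:
// a lookup is a bounds check plus a load, with no hashing at all.
pub struct FastMap {
    slots: Vec<Option<u8>>, // index is the key byte
}

impl FastMap {
    pub fn new() -> Self {
        FastMap { slots: vec![None; 256] }
    }
    pub fn insert(&mut self, key: u8, value: u8) {
        self.slots[key as usize] = Some(value);
    }
    pub fn get(&self, key: u8) -> Option<u8> {
        self.slots[key as usize]
    }
}

So how does this version perform: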

 test mine  … bench: 158,229,699 ns/iter (+/- 46,468,825)
 test other … bench: 411,004,920 ns/iter (+/- 68,749,867)

Voila – we are there: finally my version is faster and we can go out for a beer (or we could, if the fu….g corona virus were not out there and the pubs closed :-). Here is the flame graph for completeness:

Flame Graph for final solution with minimalist map

As you can see (open the SVG to explore), CPU time is now spent elsewhere. Some part is related to vector creation – so what if we use a fixed array rather than a vector? An array is simpler and sits on the stack (in this case), so it should be faster, right? Actually it’s quite the opposite: just by changing Vec<u8> to [u8; 256] (in fastmap::FastMap) we make the code about 2x slower (sorry, I do not have the measurements), which brings us finally to the first commandment of performance engineering – always measure!

What happened to audioserve in past year


Not much, but it’s definitely not dead. Actually I’m quite happy with the current functionality and do not need more (well, there are a couple of things that would be nice to have, like shared bookmarks and a read/finished audiobook attribute, but these will probably require going beyond its simplistic design), and I’m using it every day. But some coding was done during the past year and some small improvements were implemented.

The main changes in the server Rust code were not focused on features but on updating to the latest dependent libraries. The asynchronous world in Rust is moving fast, so it was necessary to catch up. I have done two major updates – first at the beginning of 2020 to the new futures and tokio 0.2 (plus hyper 0.13), and recently to tokio 1.0 (plus hyper 0.14). So audioserve is now stuffed with the latest and greatest asynchronous code available in the whole galaxy 🙂 This effort not only updated dependencies; in many places I also fixed incorrect or sub-optimal code, so the code is hopefully a bit better now (especially error handling, which was moved to anyhow/thiserror and made more consistent).

Concerning features, just a few were added:

  • The possibility to set a base-dir – for cases when you are behind a reverse proxy and need to add some root path before audioserve’s own paths.
  • Enhanced logging of authentication and access failures – with the remote IP (even if audioserve is behind a reverse proxy).
  • The built-in web client can now use space and arrow keys to control playback, and the login dialog correctly indicates failures.

On the Android client I started on a dark theme; I have a version in a separate branch and run it on my mobile, but it still needs some more changes. I cannot find the strength to finish it – Android development sucks.

For the code and changes check audioserve on github.

Are you dead?


Not yet. However, I have not published much in the last 9 months or so, for several reasons – and covid was not the least significant one. I started a couple of interesting things, but was too lazy and demotivated to sum them up even in a small article. I also changed my job a few months back, which kept me quite busy, but the positive impact is that I’m waking up from the lethargy caused by covid and my past job.

So I’ll try to quickly summarize the past year or so here; it’ll be nothing special, just to have things accounted for:

Audioserve

Audioserve is still my favorite hobby project. I’m using it every day to listen to audiobooks and I’m mostly happy with its functionality (and a friend of mine voluntarily migrated from iOS to Android just to be able to use it 🙂).

Until recently there had not been much progress; the released versions contain just a couple of new things:

  • Android client – there is finally a working “dark theme”. I’m using it and am more or less happy with it (dark themes make sense, especially on OLED displays). However, I think the Android client (as it’s written now) has reached its zenith. I was not able to update some dependencies to newer versions, because they would break the code. I think it’ll need a complete rewrite, but I have neither the time nor the will to do it now. So I hope it’ll survive future versions of Android.
    It still has many positive things, mainly the aggressive caching, but it is lacking on the UX side – the user interface will need a complete revamp.
  • On the server side I just fixed a couple of bugs, mainly focused on security.
  • The most interesting server work was adding rate limiting of incoming requests, inspired by the Leaky Bucket algorithm (used for instance in nginx) and implemented in this crate.
  • Also interesting was adding async zip folder download (this crate).
  • The web client received a couple of bug fixes plus support for a playback speed button.

But recently I got a new impulse, and quite a lot of new functionality is prepared in the master branch; I’ll just need a bit more time to test, document, polish and release. Here are the things you can look forward to:

  • A new collections cache, based on sled – a fast Rust-native key-value store. It enables fast responses for folders with many audio files and it is the enabler for the other cool functionality below (it sounds like simple stuff, but refactoring audioserve to use this new cache was a major effort).
  • More reliable playback position tracking.
  • Faster propagation of collection directory changes (add, delete, rename) into the cache – especially useful for search.
  • Backup of playback positions – you will never lose your playback positions if you back them up regularly.
  • Marking folders as read/finished – this was much-wanted functionality. Just listen to a folder to its end and it is automatically marked as finished (unless you start listening to it again, in which case it’s unmarked). (Web client only for now.)
  • Audio file tag metadata is displayed. If you read the previous articles you know I’m very skeptical about tags; they are especially messy for audiobooks. But with the new cache it’s easy to read them, so we can at least show them – and you can see the mess with your own eyes. (Web client only for now.)
  • A regular (REST) API for playback positions. We had a very special custom protocol over WebSocket, which is efficient but not very convenient. The new API is easier to use from custom clients.

Speaking about clients, you have probably noticed that I’m not very good at UX and UI – I try to evade the fact by babbling about simplistic interfaces etc., but the sad truth is my client solutions just do not stand up to current standards. So I was quite excited when KodeStar started his audiosilo project, a new web client for audioserve. The UI looks quite nice; functionality is still missing, but hopefully it’ll improve over time.

Rust

Apart from audioserve I worked on a few minor Rust things.

Java

And since I got back to Java professionally, I created a very simple demo of microservices based on Spring Boot and their deployment to Kubernetes / Openshift. The code is not worth presenting, but it is still around in my github repo.

Audioserve release and some retro fun


I recently released a new version of audioserve, v0.16.5, which brings some interesting changes. I think I described them in a previous article, and they are certainly also described in the README. This release was focused on two major themes – performance and the API. Concerning the API – I created a small fun page using the WebAmp audio player (a web recreation of the famous WinAmp). It’s just a demo of how easily the audioserve API can be used; not everything works sensibly, as WinAmp was focused on music. I used WinAmp some 20+ years back, at the dawn of the past century, so I could not resist playing with it when I saw it.

Above is a screenshot of this page. If you want, you can play with it, but do not expect much. If you want a real demo of audioserve, try this (the shared secret is mypass).

Analyzing Project Popularity on Github


Wondering how your project is doing on github and how its stars are growing over time? I was (for my favorite project audioserve), so I created an ipython notebook to analyze the trend over time, with some predictions (which are now called AI – even linear regression is now called AI :-).

Chart first

And Now Code

You will just need the usual data analysis libraries: pip3 install jupyter pandas numpy scipy matplotlib.


Social Media Influence on Project Popularity

What’s new in audioserve


Unfortunately I have not had much time to update my blog. But I’m not dead, and neither is audioserve – for the last half year or so I have actually focused on building a new client. As of now the new client is ready for production (version 0.3.3+), and the server has some new functionality too, mainly to support the new client.

There is now an online demo, automatically built from the master branch (the shared secret is mypass).

The main idea behind the new client was to implement the same caching possibilities as the older Android client had. I felt this was really the key feature to ensure smooth listening, resilience to short-term network unavailability, and the possibility to listen offline.

The key technologies for that are Service Worker and Cache Storage – they make it possible to cache audio files ahead of playback and to store the content of a whole folder for offline playback (provided the relevant API responses for that folder are in the cache too – these are also cached via the Service Worker, into a custom network-first cache).

It took me some time to play around with the Service Worker. I wanted to build the functionality from scratch to learn more about it, and I also suspected that ready-made solutions (like Workbox) would not work for my use case – I had quite particular requirements for audio file caching. Some interesting problems I met:

  • You cannot rely on the Service Worker’s lifetime; Firefox especially is quite aggressive about killing the Service Worker if client pages are not active.
    This caused problems when there was a queue of longer files to be preloaded. I solved it by polling the Service Worker periodically while it is preloading files.
  • The regular HTTP cache and the Cache Storage filled via the Service Worker can interfere in quite complex ways. It is better to set Cache-Control to no-store for resources that are managed by the custom cache.
  • Do not cache error responses – a custom cache can cache everything, so do not cache errors like 401 (lesson learned :-). However, 404s can be a different story – see the discussion later.

Another notable piece of functionality in the web client is folder icons. I’m not much of a fan of overly visual applications (text information is enough for my navigation), nor do I have consistent cover images in my collection, but it’s kind of a standard for this type of application, so I added it. It brought up a couple of interesting aspects:

  • I had to build a scaling and caching service into the audioserve server, so that icons are served efficiently. The icon is initially created from the cover image stored in the folder – it’s scaled to a given size (configurable, with a 128px default) and then stored in a cache (on the file system, not in memory; linux does a good job of caching frequently used files if there is enough memory). The next time, the same icon is served from the cache.
  • This opened another, yet unsolved, problem – how to get a cover image for single-file audiobooks – .m4b files. So I extended the media-info sub-crate to extract the cover from .m4b (where it is represented as another stream, of video type and MJPEG codec).
  • On the client side I implemented efficient loading of folder icons within the subfolder list – using IntersectionObserver, as recommended by PWA guidelines. Interestingly, the available third-party solution for Svelte did not fit well. The first reason is mentioned in the link; the second came from the need to quickly cancel the download of an icon that has left the view.
  • It also brought up the interesting issue of caching 404 responses – in my collection I have quite a lot of folders without a cover image, so 404 responses are returned for their icons. But since the collection content is rather static, it would make sense, in this case, to cache them. I looked over internet articles and the general opinion is that caching 404s makes sense if there is a good reason for it. Practice, however, is a bit different – I enabled caching via the Cache-Control header on 404 responses, but only Chrome supports it; Firefox does not cache 404s, even with the proper header.
    (Indeed, the proper solution would be to provide the information from the server – whether a subfolder has an icon or not.)

I also had a couple of issues with the Chrome browser where I discovered some bugs – the first one, in the MediaSession implementation, is still pending; the second one, a regression in the range input, was to my surprise resolved very quickly.

So this is where audioserve is now. Near-future plans (mostly driven by user feedback via github issues) are:

  • Add finished and last-modified information to files (server) and appropriate support in the web client.
  • Solve the icon 404s – just for fun I’d like to play with a 404 cache in the client, but the proper solution will also require a change in the collections cache on the server.

For the longer term, I was thinking about:

  • Improving the Rust bindings to libavformat and libavcodec so I can implement transcoding in Rust and have more control over it.
  • Using the Web Audio API – building a custom player that will interact better with the server, especially for transcoded streams.
  • The two activities above may lead to DASH support or something like it – adaptive bandwidth and more reliable playback. We will see.

So that’s all for now – if you are using audioserve and have ideas or problems, please share them via GitHub issues.




