Metrum Newsletter - August 2024
"Islands in the stream" - streaming options with Ambre and Hermes
There is a plethora of audio streaming devices available, all with differing functionality, options and sound quality. They also differ in how you can connect them, and in their aesthetics... So what are the main audiophile features to look for? The user interface, clocking, asynchronous delivery of audio data (by using an external master clock) and isolation. For audiophile performance, timing is everything!
A quick what is what
Our audio streamers receive audio that is streamed from a network server and convert it to a usable digital audio format, which is input to a DAC. Our Ambre, Baby Ambre and Hermes streamers are lightweight and require an operating system to work. Just as Windows and Linux are available for PC-based systems, operating systems like Roon, Volumio, Audirvana and many others are available for single-board computers. Once an operating system is installed, its features give access to your music library and/or streaming services like Tidal and Qobuz. If you can't see the forest for the trees (yet): in this newsletter we explain what certain features encompass and what is needed to use them, and we discuss the various operating systems. First, we explain the differences between USB audio and networked audio products.
Or is it forest through the trees?
USB audio products
The USB port on your computer can transport audio data to a USB-enabled DAC. A player program is needed to play music. Foobar, VLC, JRiver, good old Winamp and even the ubiquitous Chrome browser... all play most audio formats with ease. However, the sound quality you get from them is largely determined by how you configure the software (and thus how the audio stream is routed). The Windows volume mixer usually resamples incoming audio streams to one sample rate and can mix streams from various sources (e.g. notification sounds still play when you're watching a video). In the process, gains are changed digitally (to prevent clipping) and timing errors are introduced due to resampling. Therefore, output paths like WASAPI (in exclusive mode), ASIO and Kernel Streaming (KS) are preferred. They bypass the digital volume regulation and allow a single application to reserve the sound card (so no gain change and no mixing). As an added bonus, they offer bit-perfect playback. This can be less convenient when a computer is used for YouTube content and high-end playback at the same time. Luckily, it is easy to detect when the Windows sound handler is in use: you can then still regulate the volume in Windows. Now on to the networked audio products...
Metrum USB module
Networked audio products
As discussed in our March 2024 newsletter, streaming is the future of Music delivery. A digital audio streamer is a form of computer (usually an embedded system, running software that is called an operating system) that converts incoming digital audio from several formats (WAV, FLAC and MP3, to name a few) to a usable SPDIF or (better) I2S output format. This is converted by a Digital to Analog Converter to usable analog sound (the stuff our ears work with). Audio streamers are often a so-called endpoint, which means that a server in your network does the heavy lifting. A protocol that is often used to stream audio over a network from a 'server' to an 'endpoint' is DLNA (which builds on UPnP), although Roon uses its own RAAT system. The server can also download audio data from an external server that is located somewhere on the internet. The location of that server is less important; what matters is that the data arrives at your own server in a timely manner (to prevent stuttering sound), hence the word streaming (implying a constant stream). The audio streamer should also buffer enough audio data to be able to play without interruption, and figure out where to get the audio data from via the necessary addressing. Roon is very clever and separates the server and endpoint to take the load off the audio streamer.
Sonnet Hermes – Roon certified audio streamer
Roon: Servers and Endpoints
The Roon server is the brain of the Music system. It can host your local Music files, connects with streaming services on the internet, lets the user interact with the Roon ecosystem via phone, tablet or computer and, finally, pushes audio data to the audio devices. The audio devices are called endpoints and are controlled by the server. The user interface is also offered by the server, so it is the most central 'audio thing' in the network. With Roon, your Music becomes a catalog of your Musical personality and you can read backstories and find artists associated with the Music you love. Several endpoints can be connected to the server, allowing you to play different Music on every endpoint, or the same Music through your entire house. This implies that the server does all the heavy lifting, while the endpoint can be kept lightweight. This is also the reason why our Ambre, Baby Ambre and Hermes streamers are low-power and are less bothered by interference. This architecture also allows upgrading the server side if needed, while retaining the endpoints.
Now that we have a better understanding of the whole chain, let's dive into the intricacies of delivering audiophile quality. Timing is everything, but it is also important that what you put in is what comes out: bit-perfect playback. We'll come back to the timing in due time...
Bit-perfect playback
A term that can be found everywhere. It means that the bits of the digital signal that go into your system come out unadulterated and are not tampered with. The recording is delivered to you as-is, True to Nature, in the comfort of your own home. 'Threats' to bit-perfect playback include digital volume control implementations, sample rate conversion and the use of plugins. These are discussed in more detail below.
Artist's impression of what conversions can do to 'flatten the sound'
Changing the volume in a computer (digitally) results in a signal that is no longer bit-perfect. Every sample is multiplied by the volume setting (a factor below 1) and rounded back to a whole number, resulting in irreversible rounding errors (leading to higher quantization noise and thus lower resolution). Dithering can help to smooth over the rounding errors, but usually adds high-frequency noise which is then filtered out. This has implications in the time domain. Did we say timing before?
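To make the rounding errors tangible, here is a minimal Python sketch. It assumes 16-bit samples and a plain multiply-and-round volume control without dithering; the function name `apply_volume` and the sample values are made up for illustration and are not taken from any particular player.

```python
def apply_volume(samples, volume):
    """Scale 16-bit PCM samples by a volume factor and round to whole numbers."""
    scaled = (round(s * volume) for s in samples)
    # Keep every value inside the 16-bit range.
    return [max(-32768, min(32767, s)) for s in scaled]


# Hypothetical sample values, chosen only for illustration.
original = [1, 3, 101, -777, 12346, -32768]

quieter = apply_volume(original, 0.3)        # digital volume at 30%
restored = [round(s / 0.3) for s in quieter]  # try to undo the change

lost = sum(1 for before, after in zip(original, restored) if before != after)
print("30% volume:", quieter)
print("restored:  ", restored)
print(f"{lost} of {len(original)} samples did not come back unchanged")
```

Turning the volume back up does not bring the original numbers back: the information that was rounded away is gone. At a volume factor of exactly 1.0 nothing is rounded, which is why a bit-perfect chain leaves the digital volume alone.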
A player or sound infrastructure that converts all your Music to one sample rate can increase usability (it can play multiple sounds at once), but is no longer bit-perfect (unless the input already has the one native sample rate that all Music is resampled to). Usually, listeners are not aware that a mixer or sample rate converter is in place until it is removed and the veil it can put over the Music is lifted. For most audiophiles, there is no turning back after that.
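The sketch below shows why a sample rate converter cannot be bit-perfect. It uses plain linear interpolation, which is an assumption made for brevity: real converters use far better filters, but they too have to calculate new sample values in between the original ones. The function name `resample_linear` is hypothetical.

```python
import math
from fractions import Fraction


def resample_linear(samples, src_rate, dst_rate):
    """Resample with linear interpolation (deliberately simple, not high-end)."""
    step = Fraction(src_rate, dst_rate)  # input samples advanced per output sample
    position = Fraction(0)
    output = []
    while position <= len(samples) - 1:
        index = math.floor(position)
        fraction = float(position - index)
        following = samples[min(index + 1, len(samples) - 1)]
        output.append(round(samples[index] * (1 - fraction) + following * fraction))
        position += step
    return output


# 10 ms of a 1 kHz tone, sampled at 44.1 kHz.
tone = [round(10000 * math.sin(2 * math.pi * 1000 * n / 44100)) for n in range(441)]

converted = resample_linear(tone, 44100, 48000)
untouched = resample_linear(tone, 44100, 44100)

print("rate ratio 48000/44100 =", Fraction(48000, 44100))
print("first input samples:  ", tone[:5])
print("first output samples: ", converted[:5])
print("same-rate pass-through is bit-perfect:", untouched == tone)
```

Because 48000/44100 reduces to 160/147, almost every output sample falls between two input samples and has to be invented by the converter. Only when the input already has the target rate do the numbers pass through unchanged.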
Other 'threats' to bit-perfection are DSP (Digital Signal Processing) plugins. A DSP plugin is usually inserted in the audio chain: a software construct in which the digital audio signal path is cut and the plugin is placed in between. Some audio enthusiasts use room correction, equalizers or other sound-enhancing software. There is nothing wrong with that if it improves your personal listening experience, but the result is not the original, nor the way the artist intended it. It is a matter of choosing the lesser evil. If you have a room that is cubical in form, an equalizer can be very welcome to tame the exaggerated room modes in the low-frequency range. Adding an equalizer to remove the 'bump' in the frequency response makes the listening experience much more enjoyable, but it can also introduce a phase shift in the analog domain that is not linear over the audible frequency range. That plays tricks on the timing, and the result is that the placement of instruments in an orchestra is not as clear anymore. It's all about choices, and most of the time you can't have one without the other. No filtering is preferred, but if it's needed, benign (low order) filtering is usually best.
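As a rough illustration of the timing effect, the sketch below calculates how long a very simple digital filter delays tones of different frequencies. The one-pole low-pass filter and its coefficient `a = 0.1` are assumptions chosen for simplicity; it is a stand-in, not the filter that any specific equalizer or room correction plugin uses.

```python
import cmath
import math


def one_pole_response(freq_hz, sample_rate=44100, a=0.1):
    """Frequency response of y[n] = a*x[n] + (1 - a)*y[n-1] at one frequency."""
    w = 2 * math.pi * freq_hz / sample_rate
    return a / (1 - (1 - a) * cmath.exp(-1j * w))


def phase_delay_us(freq_hz, sample_rate=44100, a=0.1):
    """Delay of a steady tone through the filter, in microseconds."""
    h = one_pole_response(freq_hz, sample_rate, a)
    return -cmath.phase(h) / (2 * math.pi * freq_hz) * 1e6


for freq in (50, 500, 5000):
    print(f"{freq:>5} Hz is delayed by about {phase_delay_us(freq):.0f} microseconds")
```

With a linear phase shift, every frequency would be delayed by the same amount. Here the low tones arrive later than the high tones, which is the kind of timing smear described above.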
Plugins are bad?
Please note that in recording studios multi-track recordings are usually processed with plugins. This encompasses equalizing, dynamic range compressing, de-essing, stereo expanding, inflating/maximizing to make recordings sound louder, bass processing (Loudness war turned into Bass war?) or auto-tune (for vocals). Most Popular Music albums are also mastered, which means that a mastering engineer dots the final i's and adds his or her special sauce to the mix. Even more interesting is that a master for YouTube can sound different than a master for radio transmission. Why is this the case? Because different streaming services use different codecs to alter their 'sound' and allow different loudness levels to combat the illustrious loudness war. YouTube uses lossy compression and allows you to have a loudness of -14 LUFS (other LUFS levels are also possible, see a list here), where radio transmissions of popular stations are usually stereo-enhanced and as "loud and spacious" as possible to stand out from the other stations. So, the practice of using plugins is not wrong at all. We're just explaining what the other effects are when using them. If you can't solve a problem in another way or don't hear a difference, consider the drawbacks and by all means, use plugins.
Due time has just arrived... and it brings a subject that stirs a lot of conversation: clocking, and the way a bad implementation can introduce jitter.
Asynchronous clocking
What is asynchronous clocking and why is this important? In the early days of USB audio, manufacturers worked with chips that derived the necessary I2S clocks from the 12 MHz USB input clock. Mathematics tells us that 12 MHz is not a whole multiple of 44.1 kHz, so we can't divide the 12 million clock ticks per second into 44100 equal parts without adding or removing time for a sample here and there. This is exactly what jitter is: a variation in timing. Twelve million is, however, a whole multiple of 48 kHz (250 times), so it wasn't weird that in the 'old days' sound in the computer was resampled to 48 kHz. Higher jitter or resampling leads to a less than optimal Musical experience, which results in 'sterile' or 'thin' digital sound and listening fatigue, like in the old days when we just started to use class B transistor amplifiers (the lack of biasing led to cross-over distortion, making them less pleasant to listen to). The modern approach is to use clocks that run at 22.5792 MHz and 24.576 MHz. The 22.5792 MHz clock is for the 44.1 kHz multiples, whereas the other is for the 48 kHz multiples (two clocks are used because the two families do not share a convenient common clock frequency). As an example, the 22.5792 MHz clock is divided by 8, which makes for a 2.8224 MHz bit clock, leading to 32 bits per sample, 44.1 kHz stereo audio (44100 x 2 x 32 x 8 = 22579200). When switching between songs with a different sample rate, the computer or streamer needs to switch the clock and dividers on-the-fly. Techno bonus: if 16 or 24-bit audio is presented to the digital interface, the 'extra' bits of the 32 bits are filled with zeros. This process is called zero padding; it doesn't change the sample rate, but it does keep the clocking scheme simpler.
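The arithmetic above can be checked with a few lines of Python. The clock frequencies come from the text; the function names are made up for this sketch, and the zero padding assumes that the 16-bit sample is placed in the top bits of the 32-bit slot with zeros below it, which is a common convention but an assumption here.

```python
from fractions import Fraction

USB_CLOCK_HZ = 12_000_000
AUDIO_CLOCKS_HZ = {44_100: 22_579_200, 48_000: 24_576_000}


def ticks_per_sample(clock_hz, sample_rate_hz):
    """Exact number of clock ticks that fit in one sample period."""
    return Fraction(clock_hz, sample_rate_hz)


def bit_clock_hz(sample_rate_hz, channels=2, bits_per_slot=32):
    """Bit clock needed for stereo audio with 32 bits per sample."""
    return sample_rate_hz * channels * bits_per_slot


def zero_pad_16_to_32(sample):
    """Place a 16-bit sample in the top of a 32-bit slot (assumed layout)."""
    return sample << 16


for rate, master_clock in AUDIO_CLOCKS_HZ.items():
    usb_ticks = ticks_per_sample(USB_CLOCK_HZ, rate)
    verdict = "a whole number" if usb_ticks.denominator == 1 else "NOT a whole number"
    print(f"12 MHz / {rate} Hz = {float(usb_ticks):.3f} ticks per sample, {verdict}")
    divider = ticks_per_sample(master_clock, bit_clock_hz(rate))
    print(f"  {master_clock} Hz master clock / {bit_clock_hz(rate)} Hz bit clock = {divider}")
```

For both dedicated audio clocks the divider comes out as a whole number, so every sample gets exactly the same amount of time. The 12 MHz USB clock only manages that for the 48 kHz family.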
The problem with external clocks is that computers aren't synchronous units, i.e. they usually "don't play nice" if they need to deliver data at an exact moment (or when the user requests it). This is why buffers are implemented: to keep the audio bit pipe full. If a sample threatens to arrive late, the buffer is used to fill the gap. Asynchronous clocking, finally, is used to tell the computer exactly WHEN to put out data from the buffers, without dropping samples and without introducing jitter. Again, timing is everything, and it helps to achieve that True to Nature sound! People who are old enough know how long a Windows second can take: Windows kept time by counting processor cycles, but was always off, depending on the processor load. Real-time clocks and internet synchronization solved that problem. Now we enter the realm of operating systems, because a lot needs to happen internally to make the Music come out the way we want it to. As connoisseur Audiophiles, we are allowed to have demands...
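The role of the buffer can be sketched with a small simulation. Everything in it is a simplification for illustration: the computer delivers audio in irregular bursts, the audio clock takes exactly one block per tick, and playback starts once a chosen number of blocks is waiting. The function name `count_dropouts` and the burst pattern are made up.

```python
from collections import deque


def count_dropouts(arrivals, prefill):
    """Count ticks on which the audio clock finds the buffer empty.

    arrivals[t] is the number of audio blocks the computer delivers on tick t.
    Playback starts once `prefill` blocks are waiting; from then on the audio
    clock takes exactly one block per tick.
    """
    buffer = deque()
    playing = False
    dropouts = 0
    next_block = 0
    for delivered in arrivals:
        for _ in range(delivered):
            buffer.append(next_block)
            next_block += 1
        if not playing and len(buffer) >= prefill:
            playing = True
        if playing:
            if buffer:
                buffer.popleft()   # the audio clock decides WHEN data leaves
            else:
                dropouts += 1      # nothing to play: an audible gap
    return dropouts


# A computer that delivers data in irregular bursts (hypothetical pattern).
bursty = [1, 0, 2, 0, 0, 3, 0, 1, 0, 2, 1, 1]

for prefill in (1, 3):
    print(f"start after {prefill} block(s): {count_dropouts(bursty, prefill)} dropout(s)")
```

The computer is allowed to be irregular, as long as the buffer holds enough data to bridge the gaps. The steady audio clock on the output side is what sets the timing.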
Not an actual oscillator
Operating System and other software
The operating system and the modules that are used to decode the incoming data stream are of great influence on the sound quality. For high-end audio environments usually Linux is used (thank you Linus Torvalds and the open-source community) as the operating system. ALSA (Advanced Linux Sound Architecture) is the usual suspect to play our precious audio files using a series of CODECs (coding and decoding). A CODEC can deliver lossless or lossy performance, depending on input file type. MP3 is a form of lossy compression, whereas FLAC is lossless. This means that with FLAC you can revert to the original signal, whereas with MP3 you simply can't. Music is converted to the WAV format eventually, then converted to I2S and input to a DAC (this can be internal or external). An audio streamer usually sounds best when native I2S is output. More information on this topic is in our January 2024 newsletter. WAV is the mother of the PCM format, where Pulse Code Modulation is King in translating the samples that are taken with every Pulse into a Code that describes the signal deviation from the mean. If you change the air pressure, that is a Modulation, which can be seen in the Code (numbers) that is present in the WAV file. So, the main task of an operating system is to convert all incoming audio to PCM so the DAC can read it? Yes, and more... Because as a user you would also like to be able to tell the computer which track to play. And have some form of feedback, if that's not too much to ask. This implies that the computer (in any shape or form) can multi-task.
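To show what PCM in a WAV file looks like in practice, here is a small Python sketch that uses only the standard library. It turns a sine wave into 16-bit codes, writes them as a WAV file in memory and reads them back. The function names, the 440 Hz tone and the mono 16-bit format are choices made for this example only.

```python
import io
import math
import struct
import wave


def sine_pcm16(freq_hz, n_samples, sample_rate=44100, amplitude=0.5):
    """Describe a sine wave as 16-bit codes: one number per sampling pulse."""
    return [
        round(amplitude * 32767 * math.sin(2 * math.pi * freq_hz * n / sample_rate))
        for n in range(n_samples)
    ]


def write_wav(target, samples, sample_rate=44100):
    """Write mono 16-bit PCM samples to a WAV file or file-like object."""
    with wave.open(target, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes = 16 bits per sample
        wav.setframerate(sample_rate)
        wav.writeframes(struct.pack(f"<{len(samples)}h", *samples))


def read_wav(source):
    """Read 16-bit PCM samples back from a WAV file or file-like object."""
    with wave.open(source, "rb") as wav:
        raw = wav.readframes(wav.getnframes())
    return list(struct.unpack(f"<{len(raw) // 2}h", raw))


codes = sine_pcm16(440, 441)  # 10 ms of a 440 Hz tone
memory_file = io.BytesIO()
write_wav(memory_file, codes)

memory_file.seek(0)
decoded = read_wav(memory_file)

print("first codes:", codes[:6])
print("read back identical:", decoded == codes)
```

The numbers that come out are exactly the numbers that went in, which is what lossless means. A lossy format like MP3 would return numbers that are close, but not identical.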
Summary
- The operating system and audio structure do matter for the final sound quality. Tweaking sound path settings can give better sound.
- Music players (and the sound infrastructure) can be bit-perfect or not.
- Filtering and the use of plugins can improve the listening experience, but come with drawbacks that need consideration.
- The audio clock is of great influence on the perceived jitter (and thus the liveliness of the reproduction of Music).
- Jitter figures should be as low as possible.
- The best sound is achieved when native I2S is used. I2S has the unique ability to synchronously send the audio data signal and clocks.
- Player software ideally has a good user interface and connects your phone or tablet as a remote.
Now let's find out what software is out there and what their specifics are. We did your homework for you so you can play with it. We listed common operating systems (and more) that can be used in dedicated audio streamers like the Ambre, Baby Ambre and Hermes*. You are free to play around with the various operating systems and options, although we can't offer support on third-party software. Bonus points for IT-savvy customers with a multi-boot environment that uses several players on one SD card.
Roon (in bold) is delivered as standard with Metrum and Sonnet streamers, although implementations differ. Many of the operating systems listed below are derivatives of another player and thus share functionality. For example, RoPieee and RoPieeeXL are derivatives of Roon, and LibreELEC is based on Kodi (which was formerly known as XBMC). piCorePlayer uses Squeezelite and Jellyfin is based on Emby (which is not in this list). This indicates that making a good-sounding and usable Music player is quite a feat.
Legend:
$ - paid version
V – Function available
X – Function not available
Note that BubbleUPnP, Glider and JPlay are not operating systems, but rather DLNA 'glue' that can be used to cement applications together. MConnect offers similar functionality for iOS devices. As a random example, Tidal streaming can be implemented in GentooPlayer using BubbleUPnP.
| Operating System or application | Free / Paid | I2S output | Local content | UPnP / DLNA | Airplay | Deezer | Spotify | Tidal | Qobuz |
|---|---|---|---|---|---|---|---|---|---|
| Audirvana | $ | V | V | V | V | X | X | V | V |
| GentooPlayer | $ | V | V | V | V | X | V | plugin | X |
| HiFiBerryOS | Free | V | V | V | V | X | V | X | X |
| Kodi = LibreELEC.tv | Free | V | V | V | V | X | plugin | X | X |
| Max2Play | Free and paid | V | V | plugin | plugin | plugin | plugin | X | plugin |
| MoOde Audio | V | V | V | V | V | X | V | V | X |
| Mopidy | Free | V | V | V | X | X | V | X | X |
| piCorePlayer | Free | V | V | V | V | V | V | V | V |
| Pi MusicBox | Free | V | V | V | V | X | V | X | X |
| **Roon and RoonBridge** | $ | V | V | V | V | X | X | V | V |
| RoPieee | Free | V | X | X | X | X | X | X | X |
| RoPieeeXL | Free | V | X | V | V | X | V | X | V |
| RuneAudio | Free | V | V | V | V | X | V | X | X |
| Volumio free | Free | V | V | V | V | X | V | X | X |
| Volumio Virtuoso / Premium | $ | V | V | V | V | X | V | V | V |
| BubbleUPnP (Android) | Free and paid | NA | V | V | Only through 'audio loopback' streaming | Only through 'audio loopback' streaming | Only through 'audio loopback' streaming | V | V |

Note that BubbleUPnP, Glider and JPlay can aid in streaming Airplay, Deezer, Spotify, Tidal and Qobuz content via DLNA.
* Last updated on 28-11-2024. This table is free to use under CC BY-SA; please mention the original source if you use it. This information is given because customers requested it; we hope it's useful to you. Software distributors may change features or support. Note that not all software and functionality are directly tested by us with our audio streamers. Most paid services offer a free trial period. Roon also has a free trial period and regularly has discount offers. Logitech stopped supporting Squeezebox. The open-source community produced several alternatives, of which Lyrion Music Server is the most commonly known. If you have a suggestion or a correction, an email to info@metrumacoustics.com is very much appreciated.
That sums it up quite nicely... Now how to set up a different operating system (OS) on your Metrum Ambre, Metrum Baby Ambre or Sonnet Hermes streamer?
You can find a detailed description with screenshots for Volumio on the Metrum website under Burning SD Card and Setup (upper right corner, under the Technical info heading).
Most (if not all) operating systems need some settings. The most overlooked tips are below:
- Use a fresh micro SD card (larger than 8 GB) that is formatted as FAT32. If you would like to switch back to Roon later, you still have your Roon micro SD card.
- I2S output: select that you have an I2S DAC with the HiFiBerry Digi+ Pro setting. This is not the same device, but its settings conveniently correspond to those of the (Baby) Ambre and Hermes streamers.
- When other software is flashed on the Metrum Ambre, the blue front panel LED will keep blinking, indicating that it is looking for updates. This is not a bug but a feature. There is a simple fix to turn the blue LED constantly on. Short instruction: use your computer to add the line "dtoverlay=gpio-poweroff,gpiopin=27,active_low" to config.txt on the SD card.
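If you prefer a script over a text editor for the last tip, the sketch below adds the line to config.txt only when it is not there yet. The function name `add_led_overlay` is made up, and the path in the usage comment is just a placeholder: where the SD card appears depends on your computer.

```python
from pathlib import Path

# The line from the tip above, copied literally.
LED_LINE = "dtoverlay=gpio-poweroff,gpiopin=27,active_low"


def add_led_overlay(config_path):
    """Append the LED line to config.txt unless it is already present.

    Returns True if the file was changed, False if nothing had to be done.
    """
    path = Path(config_path)
    text = path.read_text()
    if LED_LINE in (line.strip() for line in text.splitlines()):
        return False
    if text and not text.endswith("\n"):
        text += "\n"
    path.write_text(text + LED_LINE + "\n")
    return True


# Example usage (placeholder path, adjust it to where your SD card is mounted):
# add_led_overlay("/path/to/sdcard/config.txt")
```

Running it a second time changes nothing, so there is no risk of ending up with the same line twice. Making a copy of config.txt before editing is still a good idea.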
Fun facts
- Streaming only got interesting after storage was no longer an issue (remember floppy discs? That's the save icon for GenZ-aged audio enthusiasts) and internet lines evolved to being fast enough.
- USB transfer and home networking speeds couldn't stay behind for long. We started USB 1.0 at 1.5 Mbps, theoretically just enough to play 16-bit, 44.1kHz sound. Clocks were derived from the USB port, making the jitter performance nothing to write home about.
- In the beginning days of what is now very normal, the internet was too slow and too expensive to play music through (modem goes beep-boop).
- The first sound chips for computers used oscillators to 'construct' sounds. The first 'wave play capable' sound cards for computer use came out around 1990.
- Around the mid-20th century, the first Digital Signal Processor made baby steps... It took around 3 decades to mature the technology. For techies: The joy of convolution is a Java applet that shows visually 'what is going on under the hood' and lets you play around with the concept of digital filters.
- The introduction of the humble Raspberry Pi single-board computer unleashed a streaming revolution (and so much more).
Closing words
A lot has happened in the relatively short history of streaming audio. Where audiophiles at first had to work with large files on hard drives, servers or USB drives, we have become more and more spoiled with very useful functionality. Today our devices are more and more interconnected and you can connect internet streaming, Bluetooth, DLNA (WiFi and cabled), HDMI and other digital content to your DAC. Nowadays, streamers are more intelligent and deliver the Music and information you need, even over the internet. The interface is a very important feature when it comes to a purchase decision; however, most of our customers prioritize how it sounds. On our Metrum Facebook page an announcement is coupled to this newsletter. You can "Let us know in the comments" how you stream Music and what the most important features are that you need for a great listening experience.
We are also proud to announce that Metrum Acoustics and Hiend Panda partnered up for a Sonnet dealership in China. Check out their website https://hiendpanda.com/. 欢迎新经销商! (Welcome, new dealer!)
Keep listening to beautiful Music, we'll talk soon...
Team Metrum Acoustics