AOIP - Networked Audio

USB Audio & MIDI Overview

Transferring audio signals and MIDI data over USB is well established. The current USB Audio standard is version 3.0 (not to be confused with USB 3.0), released by the USB-IF in the fall of 2016. However, USB is also an imperfect medium for transfer. This article covers issues that arise primarily in music production workflows where USB transfer is involved. Understanding these challenges is especially important when routing audio between operating systems, between hardware and software, and between applications. USB Audio and MIDI protocols, and audio and MIDI routing more broadly, are a moving target. A change to any protocol in your device ecosystem will affect your workflow. Because workflows themselves tend to be under construction, it’s helpful to weigh the relative merits of your transfer media: audio dropouts aren’t acceptable in live performance, and they aren’t ideal in production either. Each topic below is intended as a touch point that you can explore in further depth as suits you.

The USB Audio 3.0 standard also puts in place ultrasonic audio standards. Ultrasonic audio operates outside the normal hearing spectrum. In principle, it also turns any USB port into a listening device, presuming the application has microphone access. This technology is especially relevant to advertising: ultrasonic beacons can triangulate on body position, listen, and exchange data over ultrasonic frequencies. Very little work has been done, as yet, on privacy doctrine related to ultrasonics.

Another knock-on effect of the new USB Audio standard is the complete abstraction of the audio hardware from the processing requirements of the audio stream. To simplify, this move is analogous to the removal of the headphone jack. Pushing the DAC outside of the phone allows control over what comes in and what goes out through the connector. This can be good if you’re looking for a superb external DAC, but the DAC will be limited to whatever protocol the phone accepts. End users may benefit if higher quality devices become available, but they will also be limited to certified providers, where the cost of entry may preclude competitive innovation.

I tend to think most of the chest pounding over Apple’s “reduced” professional experience is just that. The idea is simple: you get 40 Gbps of throughput, you’re the professional, hook up whatever you want. Otherwise, your needs are too individualized, and you should custom build a system. The days of systems with 20 ports are over. Now you get 2, and you can hook any 20 devices up downstream. Touch screens on a computer are kludgy. I have one, and as a long-time Windows user I can say it’s nowhere near as good as the iPad. While musicians may clamor for a full touch Mac experience, and presuming there are a few hundred thousand out there who’d pounce on it, I don’t know the numbers, but it seems a reasonably minute amount of revenue for a user experience that is risky and could be ill-received on the whole. These are enjoyable debates, but when you return to workflow, you can count on being able to integrate just about any ecosystem. You’ll just have to dig into the weeds of the protocols.

USB Audio & Data – An Array of Challenges:

1. Serial processing of data – No matter the speed of the USB protocol (now up to 10 Gbps with USB 3.1 Gen 2, commonly found on USB-C ports), USB packets are processed one at a time. One packet follows another.

2. Interrupts – USB is a shared data stream whose allocation is determined by the host, and interrupts can contribute to jitter. It’s a good idea to keep only necessary devices connected: a mouse, a camera, or anything else that is polled over the USB wire can degrade USB performance.

3. No guaranteed delivery – Isochronous transfers, which USB Audio uses for streaming, carry no guarantee of arrival. A packet that is lost or arrives damaged is not retransmitted.

4. CRC – Cyclic Redundancy Checking is used in the USB protocol to validate USB packets. CRC uses a polynomial to compute a check value, and a corrupted packet can still pass the check as a false positive, about every K times.

5. The Buffer – Everyone likes to recommend adjusting the buffer. You can potentially adjust the buffer in a plugin, a DAW, an audio interface, your computer, other software, other computers; your cat probably even has a buffer. In iOS, for example, there are as many as 4 buffers for an input/output stream. Of course, these buffers usually aren’t visible to end users, which makes quantifying latency problematic, if not impossible. What appears as jitter may be latency, or the converse.

So buffer adjustments aren’t going to work for the long haul. Maybe you add more plugins, switch programs, or unplug a device. Each of these will affect the buffer, as will the entirety of your MIDI and audio ecosystem. Buffering is central to reducing jitter and latency. Reasonably, you should take the approach suited to your needs. If you’re not getting audio dropouts, and can record and perform without fear of these issues, you may consider that a good goal. If you want to lock down timing and reduce audio and MIDI latency and jitter to the bare minimum in your setup, then you’ll need to take the time to learn about the various buffer points in your audio/MIDI chain, and take the appropriate steps to mitigate them.

6. Also remember that for USB 3.0 and later, audio using isochronous transfer can be allocated a maximum of 90% of the USB bandwidth. In USB 2.0, that ceiling drops to 80%.

7. USB cable runs can be a maximum of about 15′ (the USB 2.0 limit is 5 m). Signal integrity and the limited power available over USB fall off past that distance. But remember, USB is just electricity.

8. Electricity, in fact, is an overlooked optimization factor. There are $500 USB cables, but what you should look for is good shielding. Powering the computer, your USB hub, or your peripherals with conditioned power is also a good idea. Clean power reduces jitter and inconsistent signals, and preserves your devices. If your location has power outages, conditioned power with battery backup will keep your studio from shutting down immediately, by holding enough power in reserve to shut your devices down cleanly when it detects power loss. And of course, live venues often have shoddy power. Overcurrent protection is helpful! In fact, here’s an area where USB-C, with its 100 W power limit, could help by powering devices that need more power than an iPad can supply. I’d like to see a solar-powered USB-C hub that I could use as a power bank or backup (or on my next camping trip).

9. There is much ado about the relative amounts of jitter and latency on Windows and Mac systems. However, if you’re willing to work with what you’ve got, you’ll be able to achieve a stable workflow, taking into account the overall specifications of your computer hardware. In my estimation, Windows is more challenging, because its audio stack wasn’t designed from the ground up for audio work the way CoreAudio was. With the proper optimizations, however, it can be great. The same goes for Android. iOS has a more stable audio platform, but it’s still unstable. And while iOS has a more delineated music app ecosystem, there is an abundance of audio utilities and instruments on Android. You’ll just have to tailor your choices more closely to your device and production needs, but it’s all there.

10. One complicated issue is using USB 1.1, 2.0, 3.0, 3.1, and USB-C together. If you’re using a hub, or any setup that mixes devices with different USB protocols, be sure to understand how they work together. Simply put, USB 1.1 devices work fine on USB 2.0 and 3.0 ports, and USB 2.0 devices work on 3.0 ports. USB-C is a different-sized receptacle, so older plugs won’t fit it without an adapter or a suitable cable.

11. USB 1.1<–>2.0. When a USB 1.1 device (like many synthesizers, current and past) connects to a USB 2.0 port, the 2.0 port carries the 1.1 device’s data upstream. That data still travels at no more than USB 1.1’s maximum rate of 12 Mbps. So be sure to keep the slowest device in the chain as far downstream as possible, as nothing after it will be able to travel faster.

12. USB 2.0<–>3.0. When a USB 2.0 device is connected to a 3.0 port, its data is still transferred on the 2.0 pipe, because USB 3.0 cables have separate pins for the 2.0 and 3.0 streams. This ensures the USB 2.0 data can transfer at its maximum speed. Of course, any device downstream will only transfer data as fast as the device before it. So it’s a good idea to order downstream devices by both their speed and how much data you’ll be transferring from each. (Image: USB 3.0 connector pinout.) The top 4 pins are for backward compatibility with 1.1 and 2.0, while the bottom 5 are for USB 3.0.
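As a rough check on the bandwidth ceilings and bus speeds in the list above, the sketch below estimates how many uncompressed PCM streams would fit in a bus’s isochronous budget. The function names, the table of buses, and the example stream format are assumptions made for illustration. The percentages are the ones quoted in this article, and the result is only an upper bound, because packet framing and protocol overhead are ignored.

```python
# Nominal signalling rate (bits per second) and the percentage of the bus
# that isochronous traffic may claim. The percentages are the figures quoted
# in this article; treat them as assumptions, not measurements.
BUSES = {
    "USB 2.0 (high speed)": (480_000_000, 80),
    "USB 3.0 (SuperSpeed)": (5_000_000_000, 90),
}


def pcm_stream_bps(sample_rate_hz, bit_depth, channels):
    """Raw PCM payload rate in bits per second (no packet overhead)."""
    return sample_rate_hz * bit_depth * channels


def max_streams(bus_bps, iso_percent, stream_bps):
    """Upper bound on identical streams that fit in the isochronous budget."""
    budget_bps = bus_bps * iso_percent // 100  # integer math avoids float rounding
    return budget_bps // stream_bps


if __name__ == "__main__":
    # Hypothetical example format: stereo, 24-bit, 48 kHz.
    stereo_24_48 = pcm_stream_bps(48_000, 24, 2)
    for name, (rate_bps, percent) in BUSES.items():
        print(name, "->", max_streams(rate_bps, percent, stereo_24_48), "streams at most")
```

Real-world counts will be lower once framing overhead and the other devices sharing the bus are taken into account, which is the reason to keep unnecessary devices unplugged.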

Time Code – Getting timecode correct is like the black art of audio. If you’re routing audio and MIDI between apps, operating systems, computers, or tablets, from any one to any number of others, you should take the time to learn about the clocking systems involved. Which computer, software, or hardware will be your master clock? How can you determine which is best? How can you be sure the various clock signals are in sync, and stay in sync? If you add a device or piece of software to a set or recording session, how will its clock be managed? USB audio and MIDI are not the best carriers for clock signals. Given protocol aspects like the interrupts and lack of guaranteed delivery mentioned above, clock is complicated. Clock drift also occurs, which results in samples being taken across the length of the USB frame instead of always in the middle of the frame, which is optimal for timing.
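To get a feel for how quickly two free-running clocks fall out of step, here is a minimal sketch of the arithmetic. The function names, the 50 ppm offset, and the 256-frame buffer are hypothetical values chosen for illustration, not figures from any particular device.

```python
def drift_samples(sample_rate_hz, ppm_offset, seconds):
    """Samples by which two clocks diverge after `seconds`, given their
    relative frequency offset in parts per million (ppm)."""
    return sample_rate_hz * ppm_offset / 1_000_000 * seconds


def seconds_until_slip(sample_rate_hz, ppm_offset, buffer_frames):
    """Time until the accumulated drift equals one buffer of `buffer_frames`,
    roughly where an unsynchronized stream would overrun or underrun it."""
    drift_per_second = sample_rate_hz * ppm_offset / 1_000_000
    return buffer_frames / drift_per_second


if __name__ == "__main__":
    # Hypothetical setup: two 48 kHz devices whose clocks differ by 50 ppm.
    print("drift after one minute (samples):", drift_samples(48_000, 50, 60))
    print("seconds until a 256-frame buffer slips:", seconds_until_slip(48_000, 50, 256))
```

Even a small offset adds up over the length of a set, which is why deciding on a single master clock matters more than the absolute accuracy of any one device.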

Individually, any of the above factors can potentially be waved off. Many will claim, “it’s close enough,” or that anything under 10ms of latency isn’t discernible. However, if you’re hitting 8ms, then you’re also hitting 10, and probably 12. You will always have variance, because wherever there is latency there is only imperfect control over it.
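The point about variance can be made concrete with a little arithmetic. The sketch below converts buffer sizes to milliseconds, sums a chain of buffers, and summarizes a set of latency measurements. The function names, buffer sizes, and measurement values are hypothetical; substitute numbers from your own system.

```python
import statistics


def buffer_latency_ms(frames, sample_rate_hz):
    """Delay contributed by one buffer of `frames` samples, in milliseconds."""
    return frames / sample_rate_hz * 1000.0


def chain_latency_ms(buffer_frames, sample_rate_hz):
    """Total delay of several buffers in series (plugin, DAW, driver, ...)."""
    return sum(buffer_latency_ms(f, sample_rate_hz) for f in buffer_frames)


def latency_spread(measurements_ms):
    """Summarize repeated latency measurements; the peak-to-peak figure is a
    simple stand-in for jitter."""
    low, high = min(measurements_ms), max(measurements_ms)
    return {
        "min": low,
        "mean": statistics.mean(measurements_ms),
        "max": high,
        "peak_to_peak": high - low,
    }


if __name__ == "__main__":
    # Hypothetical chain of four buffers at 48 kHz.
    print("chain latency (ms):", chain_latency_ms([128, 256, 128, 256], 48_000))
    # Hypothetical round-trip measurements in milliseconds.
    print(latency_spread([8.0, 10.0, 12.0]))
```

Looking at the maximum and the spread, rather than a single best-case reading, is what tells you whether a setup will hold up during a performance.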

Another consideration is how transfer protocols relate to your workflow. If you’re looking for a rock-solid, no-frills system, more traditional audio and DIN MIDI solutions might be appropriate. But if you’re using Lightning to USB to connect your i-Device back to your Mac or PC, USB OTG with your Android tablet or phone, a link between a Mac and a PC, or a USB external host, then it’s to your benefit to rigorously test and optimize your system’s data flow.

It’s also important to recognize USB as a moving protocol. Outside of how it handles audio, the USB protocol interacts with your computer in a variety of ways, and there are many opportunities for developer customization and, subsequently, error. This too can be the reason behind issues with the transport of USB Audio and MIDI, and with computer-processed music in general. The way new devices are enumerated, for example, may be inconsistently implemented, resulting in devices that hang or can’t reconnect without being reset. If you’re bringing together an ecosystem of devices, expect to learn the specifics of all your devices and how they interact.

More to Consider

This article points out some of the considerations music producers can use to set up or optimize their systems. There are many aspects to consider, and further articles will explain them individually in greater detail.
