An Audio Founder Blind-Tested Three of His Own Amps and Couldn’t Spot the “Bad” One


What he did next changed how his entire company evaluates audio gear.

Schiit Audio co-founder Jason Stoddard recently described a blind listening test he ran on three of his company’s amplifiers. One of them measured far worse on paper than the others, so he set up a short blind comparison to find out whether the numbers matched his ears.

Yet when he listened without knowing which amp was which, he could not hear a difference.

The experience, shared during an interview on the Passion for Sound YouTube channel, reinforced his long-standing view that the format and conditions of a test can strongly influence what listeners perceive, particularly in short-term blind tests.


A Blind Test That Challenged Expectations

The test itself was simple. Stoddard sat down with three Schiit amplifiers: the Magni, the Magni Heresy, and the Vali. All three were routed through a blind switcher and carefully level-matched.

On paper, the Vali seemed like the clear outlier. The tube hybrid amplifier measures around –40 dB THD+N. By conventional engineering expectations, that level of distortion should be clearly audible.
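For a sense of scale, a THD+N figure in decibels can be converted to a percentage of the signal. The sketch below assumes the usual amplitude-ratio convention (20 times the base-10 logarithm of the ratio). The function name `thdn_db_to_percent` and the -100 dB comparison value are illustrative only and do not come from Schiit or the interview.

```python
def thdn_db_to_percent(level_db: float) -> float:
    """Convert a THD+N level in dB (relative to the signal) to a percentage.

    Assumes the amplitude-ratio convention: dB = 20 * log10(ratio).
    """
    ratio = 10 ** (level_db / 20)
    return ratio * 100


# The Vali figure quoted above, about -40 dB, works out to roughly 1%.
print(f"{thdn_db_to_percent(-40):.2f}%")
# A hypothetical -100 dB amplifier, for comparison only, is about 0.001%.
print(f"{thdn_db_to_percent(-100):.4f}%")
```

In other words, the distortion and noise in the Vali's output sit at about one hundredth of the signal's amplitude.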

He switched through A, B, and C, then repeated the comparison. And each time, the sound appeared identical.

His first reaction was not that the amplifiers sounded the same. Instead, he assumed something in the setup had gone wrong. So, he double-checked the connections and the switcher with the engineer running the test, making sure that everything was functioning correctly.

“I went, that’s cool, everything’s shorted together because they sound all exactly the same,” he recalled. “And they just started laughing.”

Once the setup was confirmed, the result was clear: even the Vali, the amplifier that measured far worse than the others, sounded identical to the rest, at least when everything was level-matched and compared over a short listening window.

Why Short Blind Tests Can Be Misleading

For Stoddard, that result underscored what he sees as a broader problem with traditional blind listening tests.

Formats such as ABX comparisons, a common blind listening method, often compress the listening process into a very short period of time. Under those conditions, listeners may not hear the equipment the way they normally would, and there is often pressure to reach a conclusion quickly.
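ABX results are usually judged against chance, since a listener who hears no difference still gets about half the trials right by guessing. The following is a minimal sketch of that arithmetic, assuming independent trials and a 50% guess rate. The function name and the trial counts are hypothetical and are not taken from Stoddard's test, which compared three amplifiers rather than running scored ABX trials.

```python
from math import comb


def chance_of_at_least(correct: int, trials: int) -> float:
    """Probability of getting at least `correct` of `trials` right by pure guessing.

    Assumes independent trials with a 50% chance each (one-sided binomial tail).
    """
    favorable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favorable / 2 ** trials


# Hypothetical session: 12 correct answers out of 16 trials.
p = chance_of_at_least(12, 16)
print(f"Chance of scoring 12/16 or better by guessing: {p:.3f}")
```

A short session leaves little room for error under this kind of scoring, which may be part of why a rushed, high-pressure test can come back as "no difference."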

“I could see everyone in the world failing that every time because it’s high pressure,” Stoddard said about fast switching tests. “It’s probably music you don’t know. It’s not at your preferred volume range. There’s a million things why that would fail.”

That matched what the interviewer had noticed in his own reviewing work. He noted that differences between devices often begin to emerge only after longer listening sessions.

“I’ll often start a comparison and go, ‘Yep, these are the same,’” he said. “Then I’ll keep switching back and forth and listen for different things. (…) It can take ages.”

Schiit’s Alternative: Long-Term Listening Comparisons

Recognizing these limitations, Stoddard and his team have systematized a different approach. Instead of relying on rapid switching tests, Schiit Audio sometimes sends out unlabeled units to listeners for extended evaluation.

Each device is labeled only with a letter such as A, B, X, or Y. The listeners are asked to use them in their normal listening environments.

“People can sit down for a week or a month, play their own music,” Stoddard explained. “Tell us which one you like better.”

Over time, patterns tend to emerge. People usually settle on a favorite, and these preferences often align across multiple listeners.

The company has also used this method as a control experiment. By sending out two identical units under different labels, they can see whether listeners perceive differences that aren’t actually there.

“A lot of people say, I don’t know. It could be either of them,” Stoddard said. “I don’t really care which one you do.”

Such responses suggest that, with this method, listeners are less inclined to invent differences when none exist.

What This Means for the Audiophile Debate

Stoddard does not argue that measurements or blind testing should be ignored. His point is that the structure of a test can strongly influence what listeners perceive.

Metrics such as THD+N capture real characteristics of a device. Whether those characteristics are audible depends on the listening context. Factors such as volume matching, program material, and listening duration can all influence perception.

Even Schiit’s own engineers recognize the limits of prediction. Stoddard recalled that Schiit’s head of R&D digital, Martin, once described the process in blunt terms.

“I had no idea in analog design how much of it is just guessing,” Martin told him.

In practice, engineers frequently test an idea, listen to the results, and then refine the design. This shows that even with measurements in hand, there is a lot of experimentation involved in analog audio.

Overall, Stoddard believes that both measurements and listening have a role to play when evaluating audio equipment. Each provides useful information, but neither tells the entire story on its own.

For listeners and engineers alike, the lesson is that an amplifier that looks worse on paper may not sound any different in a short listening test. Extending the evaluation period, using familiar music, and removing pressure may uncover nuances that rapid testing misses.

Headphonesty

Alexandra Plesa
