Cosmic Average wrote: According to the standards set up by the American Board of Recorded Evidence, you need "ten comparable words between two samples" to make a credible determination. The screams are just the word "help" repeated.
Thanks, that's good to know.
The experts, both of whom said they have testified in cases involving audio analysis, stressed they cannot say who was screaming. They have no samples of Martin's voice.
Such analysis could play a role should there be a criminal or civil case over Martin's death. Primeau, who said he uses a combination of critical listening skills and spectrum analysis, called voice identification "an exact science" that can help a legal team in court.
Yet standards set by the American Board of Recorded Evidence indicate "there must be at least 10 comparable words between two voice samples to reach a minimal decision criteria." While Zimmerman says more than that many words on his 911 call, the only one heard on the second is a cry for "help."
This is exactly why I am skeptical of this "evidence." There are huge obstacles to voice recognition analysis even under perfect laboratory conditions, and in this case they don't have NEARLY enough of a sample to make any sort of determination. One scream of "help," under stressful conditions, with less-than-optimal recording equipment. There is no way this should be admissible at all.
Further ...
Kamikazie Sith wrote:
I did some research into Tom Owen and his company. His company appears to be established and they've published numerous papers. I don't have any reason to doubt them.
I am still curious as to what techniques they used for this analysis, but I can't find a copy of any of their publications online.
According to this INTERPOL report on forensic speech analysis (which, from 2001, MAY be out of date ... hard to know), there are three primary methods:
The first group consists of trained phoneticians. They rely primarily on a combination of auditory phonetic analysis and a variety of acoustic measurements, and will generally only consider themselves competent to analyse speech samples in their own native language ... Perhaps the main criticism of this type of approach is that it has a strong subjective element and does not easily lend itself to validation.
The second group consists of those who use a set of semi-automatic measurements of particular acoustic speech parameters such as vowel formants, articulation rate and the like, sometimes combined with the results of a detailed, largely auditory phonetic analysis by a human expert.
Most automatic speaker identification systems today use a form of Gaussian mixture modelling to characterise or 'model' the speech of the known, target speaker (i.e., frequently the suspect in a forensic application) and that of the unknown speaker (i.e., the perpetrator). In addition to this, a relevant speaker population is defined and a probability-density function of the speech variance of this set is calculated. What the method essentially sets out to do is determine how likely a degree of similarity or difference as found between the target speaker (say the suspect) and an unknown speaker (say the perpetrator) is to occur within the relevant population.
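To make that third description a bit more concrete, here is a bare-bones sketch of what a GMM likelihood-ratio comparison looks like. This uses scikit-learn and made-up feature arrays as stand-ins for real acoustic features; I obviously have no idea what software or features Owen's lab actually used.

```python
# Minimal sketch of GMM-based speaker comparison (likelihood-ratio style).
# Assumes feature vectors (e.g. MFCCs) have already been extracted from audio;
# the arrays below are random stand-ins, NOT real data from the 911 calls.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Pretend features: rows are frames, columns are acoustic features (e.g. 13 MFCCs).
target_features = rng.normal(0.0, 1.0, size=(500, 13))      # known speaker (the "suspect")
unknown_features = rng.normal(0.2, 1.0, size=(200, 13))      # questioned recording
population_features = rng.normal(0.0, 1.2, size=(5000, 13))  # "relevant population" / background

# Model the known speaker and the background population separately.
target_model = GaussianMixture(n_components=8, covariance_type="diag").fit(target_features)
background_model = GaussianMixture(n_components=32, covariance_type="diag").fit(population_features)

# Score the questioned recording against both models.
# score_samples returns per-frame log-likelihoods; average them.
llr = (target_model.score_samples(unknown_features).mean()
       - background_model.score_samples(unknown_features).mean())

# Positive llr: the unknown speech fits the target speaker better than the
# general population; negative: the reverse. Real systems calibrate this score
# and compare it to a threshold rather than treating it as a yes/no answer.
print(f"average log-likelihood ratio: {llr:.3f}")
```

The point for this case: the target model needs a decent amount of clean speech from the known speaker, and the whole comparison assumes the questioned sample is ordinary speech under similar conditions, not a scream over a phone line.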
I assume from the description in the news article provided earlier that the last of the three was used in this case. The first two are heavily dependent on multiple utterances of the same phonemes, which would not be available in this instance. The third ... well, the INTERPOL report agrees with what I said earlier about the unreliability of this type of analysis:
As a result, speakers may not always be reliably distinguished, and the system will produce a certain proportion of false-positives. ... However, as in all biometric identification techniques, there is a trade-off between false-positives and false rejections, which means that a system that is biased towards reducing false-positives will tend to produce unacceptable levels of false rejections and/or report unrealistically low probability scores for matches.
The second problem is related to the extreme sensitivity to transmission channel effects of automatic procedures, including the effects of different handsets, telephone lines, GSM-coding and perception-based compression techniques as used in Minidisk players and compression formats like MPEG. Recent research by Schmidt Nielsen & Crystal [21] confirms that, while human listeners show tremendous individual variability in performance, on average they tend to slightly outperform current state-of-the-art speaker verification systems. More importantly, they found that it is especially when conditions deteriorate as a result of differences in transmission channels, the presence of background noise and the like that human listeners are clearly superior to automatic speaker verification algorithms. It is precisely these conditions that tend to prevail in the forensic context.
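That false-positive vs. false-rejection trade-off is just a question of where you set the decision threshold on the comparison scores. A toy demonstration with synthetic score distributions (purely illustrative numbers, not from any real system):

```python
# Toy illustration of the false-positive vs. false-rejection trade-off.
# Scores are synthetic: "genuine" trials are same-speaker comparisons,
# "impostor" trials are different-speaker comparisons.
import numpy as np

rng = np.random.default_rng(1)
genuine_scores = rng.normal(2.0, 1.0, 1000)   # same-speaker comparisons
impostor_scores = rng.normal(0.0, 1.0, 1000)  # different-speaker comparisons

for threshold in (-1.0, 0.0, 1.0, 2.0, 3.0):
    false_accepts = np.mean(impostor_scores >= threshold)  # false positives
    false_rejects = np.mean(genuine_scores < threshold)    # false rejections
    print(f"threshold {threshold:+.1f}: "
          f"false-positive rate {false_accepts:.1%}, "
          f"false-rejection rate {false_rejects:.1%}")
```

Push the threshold up and false positives drop, but genuine matches start getting rejected, which is exactly the trade-off the report describes.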
From my own professional experience with speech analysis, I would barely trust this type of forensic analysis under IDEAL lab conditions, never mind the extremely poor conditions we have in this instance.