Scientists Train AI To Learn People's Voices, Then Generate Their Faces (livescience.com) 63

JustAnotherOldGuy shares a report from Live Science: A neural network named "Speech2Face" was trained by scientists on millions of educational videos from the internet that showed over 100,000 different people talking. From this dataset, Speech2Face learned associations between vocal cues and certain physical features in a human face, researchers wrote in a new study. The AI then used an audio clip to model a photorealistic face matching the voice, and the results are surprisingly close to the actual faces of the people whose voices it listened to. The faces generated by Speech2Face didn't precisely match the people behind the voices. But the images did usually capture the correct age ranges, ethnicities and genders of the individuals, according to the study. The findings have been published in the preprint journal arXiv but have not been peer-reviewed.
This discussion has been archived. No new comments can be posted.

  • by PopeRatzo ( 965947 ) on Tuesday June 11, 2019 @09:28PM (#58748030) Journal

    So, the NSA could get an AI to learn my first wife's voice and then use it instead of waterboarding as a form of torture?

    Neat!

  • Code. (Score:4, Informative)

    by RyanFenton ( 230700 ) on Tuesday June 11, 2019 @10:21PM (#58748190)

    https://github.com/imatge-upc/... [github.com]

    It's in python - not my code, just found it through google.

    Ryan Fenton

  • This is like that bit in BlacKkKlansman all over again, when David Duke claims he can identify black people by the way they pronounce 'are'...

  • by Anonymous Coward

    The findings have been published in the preprint journal arXiv but have not been peer-reviewed.

    arXiv is not a journal, and the findings have not yet been published. When people place their manuscript on arXiv, it's usually at the same time as they submit it to a peer-reviewed journal. In other words, arXiv is a de facto repository for manuscripts currently undergoing peer review. It is incorrect to report it as if it's already published, but in a journal that is not peer-reviewed.

  • Scientists train AI not only to recognize voices and faces. According to Bloomberg's latest research [bloomberg.com], Amazon is now working on a device that will be able to recognize people's emotions and even help build social relations. Sounds impressive, doesn't it? However, the question is whether we can rely on a machine in building our relations with other people...
  • This looks to me like an association based on race and age more than a determination of a person's face.

    It's AI/ML used to identify a correlation between age, race and gender in a person's speech, and then it produces a generic output.

    What I'd be more interested in seeing is if they run it against 100 speech samples from women in the same region and of the same race. Let's see how it does then.

  • Will the phone sex industry survive clients finding out what the person they've been talking to actually looks like?
    • Will the phone sex industry survive clients finding out what the person they've been talking to actually looks like?

      I don't think 13 year olds really care what the person they're talking to actually looks like. They could look like a mule with a face plastered with blue waffles and a 13 yo "would hit it".
