=== Digital sound-alikes ===
At the 2018 [[Conference on Neural Information Processing Systems]] (NeurIPS), researchers from [[Google]] presented the work 'Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis', which applies [[transfer learning]] from [[speaker recognition|speaker verification]] to text-to-speech synthesis that can be made to sound almost like anybody from a speech sample of only five seconds.<ref name="GoogleLearningTransferToTTS2018">{{Citation | last1 = Jia | first1 = Ye | last2 = Zhang | first2 = Yu | last3 = Weiss | first3 = Ron J. | title = Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis | journal = [[Advances in Neural Information Processing Systems]] | volume = 31 | pages = 4485–4495 | date = 2018-06-12 | language = en | arxiv = 1806.04558 }}</ref> Researchers from [[Baidu Research]] also presented a [[voice cloning]] system with similar aims at the same conference,<ref name="Baidu2018">{{Citation | last1 = Arık | first1 = Sercan Ö. | last2 = Chen | first2 = Jitong | last3 = Peng | first3 = Kainan | last4 = Ping | first4 = Wei | last5 = Zhou | first5 = Yanqi | title = Neural Voice Cloning with a Few Samples | journal = [[Advances in Neural Information Processing Systems]] | volume = 31 | year = 2018 | url = http://papers.nips.cc/paper/8206-neural-voice-cloning-with-a-few-samples | arxiv = 1802.06006 }}</ref> though the result was rather unconvincing.
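In outline, the transfer-learning approach works by having a speaker-verification encoder collapse a short reference recording into a fixed-size voice embedding, which then conditions every step of the text-to-speech synthesizer. The following is a minimal illustrative sketch in Python with NumPy, not the actual model from the paper; the function names, feature dimensions, and the mean-pooling "encoder" are all simplified assumptions.

```python
import numpy as np

def speaker_embedding(frames: np.ndarray) -> np.ndarray:
    """Collapse variable-length acoustic frames (n_frames, n_features)
    into a fixed-size, L2-normalised voice embedding. A real speaker
    verification encoder is a trained network; mean pooling stands in
    for it here."""
    e = frames.mean(axis=0)
    return e / np.linalg.norm(e)

def condition_synthesizer_inputs(text_features: np.ndarray,
                                 embedding: np.ndarray) -> np.ndarray:
    """Concatenate the speaker embedding onto every frame of the text
    encoder's output, the core conditioning step that lets one
    synthesizer produce many voices."""
    tiled = np.tile(embedding, (text_features.shape[0], 1))
    return np.concatenate([text_features, tiled], axis=1)

rng = np.random.default_rng(0)
ref_audio = rng.normal(size=(250, 40))  # ~5 s of 20 ms frames, 40 features
text = rng.normal(size=(120, 64))       # hypothetical encoded phoneme sequence

emb = speaker_embedding(ref_audio)
conditioned = condition_synthesizer_inputs(text, emb)
print(conditioned.shape)  # (120, 104): each frame now carries the voice identity
```

Because the embedding is computed from the reference audio alone, swapping in a different five-second sample changes the target voice without retraining the synthesizer, which is what makes such short samples sufficient.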
By 2019 digital sound-alikes had found their way into the hands of criminals: [[NortonLifeLock|Symantec]] researchers knew of three cases where digital sound-alike technology had been used for crime.<ref name="BBC2019">{{cite web |url= https://www.bbc.com/news/technology-48908736 |title= Fake voices 'help cyber-crooks steal cash' |date= 2019-07-08 |website= [[bbc.com]] |publisher= [[BBC]] |access-date= 2019-09-11 }}</ref><ref name="WaPo2019">{{cite news |url= https://www.washingtonpost.com/technology/2019/09/04/an-artificial-intelligence-first-voice-mimicking-software-reportedly-used-major-theft/ |title= An artificial-intelligence first: Voice-mimicking software reportedly used in a major theft |last= Harwell |first= Drew |date= 2019-09-04 |newspaper= Washington Post |access-date= 2019-09-08 }}</ref> This adds to the strain of the disinformation situation, coupled with the facts that:
* [[Human image synthesis]] since the early 2000s has improved beyond the point where humans can reliably tell a real human imaged with a real camera from a simulation of a human imaged with a simulation of a camera.
* 2D video forgery techniques were presented in 2016 that allow [[Real-time computing#Near real-time|near real-time]] counterfeiting of [[facial expressions]] in existing 2D video.<ref name="Thi2016">{{cite web | last = Thies | first = Justus | title = Face2Face: Real-time Face Capture and Reenactment of RGB Videos | publisher = Proc. Computer Vision and Pattern Recognition (CVPR), IEEE | year = 2016 | url = http://www.graphics.stanford.edu/~niessner/thies2016face.html | access-date = 2016-06-18}}</ref>
* At [[SIGGRAPH]] 2017 an audio-driven digital look-alike of the upper torso of Barack Obama was presented by researchers from the [[University of Washington]]. After a training phase that acquired [[lip sync]] and wider facial information from training material consisting of 2D videos with audio, the animation was driven by a voice track alone.<ref name="Suw2017">{{Citation | last1 = Suwajanakorn | first1 = Supasorn | last2 = Seitz | first2 = Steven | last3 = Kemelmacher-Shlizerman | first3 = Ira | title = Synthesizing Obama: Learning Lip Sync from Audio | publisher = [[University of Washington]] | year = 2017 | url = http://grail.cs.washington.edu/projects/AudioToObama/ | access-date = 2018-03-02 }}</ref>

In March 2020, a [[freeware]] web application called 15.ai that generates high-quality voices of an assortment of fictional characters from a variety of media sources was released.<ref name="Batch042020">{{cite web|last=Ng|first=Andrew|date=2020-04-01|title=Voice Cloning for the Masses|url=https://blog.deeplearning.ai/blog/the-batch-ai-against-coronavirus-datasets-voice-cloning-for-the-masses-finding-unexploded-bombs-seeing-see-through-objects-optimizing-training-parameters|url-status=dead|archive-url=https://web.archive.org/web/20200807111844/https://blog.deeplearning.ai/blog/the-batch-ai-against-coronavirus-datasets-voice-cloning-for-the-masses-finding-unexploded-bombs-seeing-see-through-objects-optimizing-training-parameters|archive-date=2020-08-07|access-date=2020-04-02|website=deeplearning.ai|publisher=The Batch}}</ref> Initial characters included [[GLaDOS]] from ''[[Portal (series)|Portal]]'', [[Twilight Sparkle]] and [[Fluttershy]] from the show ''[[My Little Pony: Friendship Is Magic]]'', and the [[Tenth Doctor]] from ''[[Doctor Who]]''.