Datatang provides clients with complete speech synthesis technology. Datatang is committed to collecting the world’s language corpora and processing voice content, including gender, emotion, and speech transcription. Datatang offers high-quality voice data for Automatic Speech Recognition (ASR), Text to Speech (TTS), and related applications.
- Device: mobile/laptop/high fidelity microphone
- Recording scenarios: in car/in quiet room/on street
- Content uploaded over the Internet: personal information/audio data
A typical procedure for speech transcription-annotation includes:
1. Machine transcription generates text from speech.
2. An annotator proofreads the results from step 1.
3. An annotator labels speech features (speaker’s gender, accent, timestamps, etc.).
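The three steps above can be pictured as successive updates to one record per utterance. The following is a minimal sketch, not Datatang's actual tooling: the `Utterance` class, its field names, and the example values are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Utterance:
    """One annotated audio segment (hypothetical schema, for illustration)."""
    audio_path: str
    start_s: float                    # segment start time, in seconds
    end_s: float                      # segment end time, in seconds
    machine_text: str = ""            # step 1: machine transcription output
    final_text: Optional[str] = None  # step 2: proofread transcript
    gender: Optional[str] = None      # step 3: speaker feature labels
    accent: Optional[str] = None

    def proofread(self, corrected: Optional[str] = None) -> None:
        """Step 2: keep the machine text, or replace it with a correction."""
        self.final_text = corrected if corrected is not None else self.machine_text

    def label(self, gender: str, accent: str) -> None:
        """Step 3: attach speaker feature labels."""
        self.gender = gender
        self.accent = accent

    def is_complete(self) -> bool:
        """True once the record has been both proofread and labelled."""
        return None not in (self.final_text, self.gender, self.accent)


# Example values are invented for this sketch.
u = Utterance("sample.wav", 0.0, 2.5, machine_text="turn on the ligt")
u.proofread("turn on the light")
u.label(gender="F", accent="Beijing")
```

Keeping the machine output and the proofread text in separate fields preserves the original transcription, so the two can be compared later.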
Client Expectation: A client needs to collect speech data from 1000 native Chinese speakers at home for smart home research.
1. Total Number: 1000
2. Gender Balance: F/M=50%:50%
3. Accent Balance:
Speakers from 7 dialect areas,
including Beijing, Tianjin, the Pearl River Delta, and the Yangtze River Delta region
4. Age Balance:
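The speaker targets above can be turned into per-group quotas with simple arithmetic. The sketch below assumes an even split across groups. That holds for the stated 50%:50% gender ratio, but it is only an assumption for the dialect areas. The `split_evenly` helper and the `area_1` … `area_7` labels are hypothetical placeholders, since only some of the 7 areas are named here.

```python
from typing import Dict, List


def split_evenly(total: int, groups: List[str]) -> Dict[str, int]:
    """Divide `total` across `groups` so that counts differ by at most one."""
    base, extra = divmod(total, len(groups))
    # The first `extra` groups absorb the remainder, one speaker each.
    return {g: base + (1 if i < extra else 0) for i, g in enumerate(groups)}


TOTAL_SPEAKERS = 1000

gender_quota = split_evenly(TOTAL_SPEAKERS, ["F", "M"])

# Placeholder area names; an even split per area is an assumption.
dialect_quota = split_evenly(TOTAL_SPEAKERS, [f"area_{i}" for i in range(1, 8)])
```

Because 1000 is not divisible by 7, an even split cannot be exact: six areas receive 143 speakers and one receives 142.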
1. Various microphone arrays (far or near field)
including the current major microphone arrays:
6-microphone annular array and 6+1-microphone annular array
2. Mobile Phones (near field)
at least 8 phone models
1. Room Selection
Three types of room: living room, bedroom, and kitchen.
Recorded in 10 houses.
Room size: small (15m²-20m²), medium (20m²-30m²), large (30m²-40m²)
2. Microphone Location
Four recording points along the speech direction, at distances of 0.5m, 1m, 3m, and 5m from the source
3. Noise Collection
Daily noise at home: human voice noise, TV noise, household appliance noise
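The room and microphone settings above can be enumerated as a grid of recording conditions. The sketch below assumes every room type is crossed with every room size and every microphone distance. That full crossing is an illustrative assumption, not a stated requirement, and the variable names are hypothetical.

```python
from itertools import product

ROOM_TYPES = ["living room", "bedroom", "kitchen"]
ROOM_SIZES = ["small", "medium", "large"]
DISTANCES_M = [0.5, 1, 3, 5]  # microphone distance from the source, in metres

# One entry per (room type, room size, distance) combination.
conditions = [
    {"room": room, "size": size, "distance_m": dist}
    for room, size, dist in product(ROOM_TYPES, ROOM_SIZES, DISTANCES_M)
]

print(len(conditions))  # 3 room types x 3 sizes x 4 distances = 36
```

In practice some combinations may not exist in the 10 houses (for example, a large kitchen), so a list like this is better treated as a checklist to prune than as a fixed plan.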