Speech processing
Speech processing is the study of speech signals and of the methods used to process them.

The signals are usually processed in a digital representation, so speech processing can be regarded as a special case of digital signal processing applied to speech signals.
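As a rough illustration of what a digital representation involves, the sketch below samples a continuous waveform at discrete time instants and rounds each sample to an integer code. A pure sine tone stands in for a speech signal; the function name, the 8 kHz sampling rate, and the 16-bit depth are illustrative assumptions, not values taken from this article.

```python
import math

def sample_and_quantize(freq_hz, sample_rate_hz, duration_s, bits=16):
    """Sample a sine tone and quantize it to signed integer codes.

    Hypothetical helper for illustration: a sine tone stands in for speech.
    """
    max_code = 2 ** (bits - 1) - 1              # 32767 for 16 bits
    n_samples = int(round(sample_rate_hz * duration_s))
    samples = []
    for n in range(n_samples):
        t = n / sample_rate_hz                  # discrete time instant
        x = math.sin(2 * math.pi * freq_hz * t) # continuous amplitude in [-1, 1]
        samples.append(int(round(x * max_code)))  # discrete amplitude
    return samples

# 10 ms of a 200 Hz tone sampled at 8 kHz (assumed, telephone-style rate)
pcm = sample_and_quantize(freq_hz=200.0, sample_rate_hz=8000, duration_s=0.01)
```

The resulting list of integers is the kind of sequence that digital signal processing methods then operate on.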

It is also closely tied to natural language processing (NLP), as its input can come from, and its output can go to, NLP applications. For example, text-to-speech synthesis may use a syntactic parser on its input text, and the output of speech recognition may be used by information extraction techniques.

Speech processing can be divided into the following categories:
  • Speech recognition, which deals with analysis of the linguistic content of a speech signal.
  • Speaker recognition, where the aim is to recognize the identity of the speaker.
  • Speech coding, a specialized form of data compression, which is important in telecommunications.
  • Voice analysis for medical purposes, such as analysis of vocal loading and dysfunction of the vocal cords.
  • Speech synthesis: the artificial synthesis of speech, which usually means computer-generated speech.
  • Speech enhancement: enhancing the intelligibility and/or perceptual quality of a speech signal, comparable to audio noise reduction for general audio signals.
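To make the speech coding category more concrete, here is a minimal sketch of mu-law companding, a classic technique from digital telephony. The article does not name any particular coding method, so this choice is an assumption made for illustration; the function names are hypothetical, and the code is a simplified floating-point version, not a bit-exact implementation of any telephony standard.

```python
import math

MU = 255.0  # companding parameter commonly used for 8-bit mu-law telephony

def mu_law_compress(x, mu=MU):
    """Map a sample in [-1, 1] to a companded value in [-1, 1]."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)

def mu_law_expand(y, mu=MU):
    """Inverse of mu_law_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)

def encode(x, bits=8):
    """Compand a sample, then quantize it uniformly to an unsigned code."""
    levels = 2 ** bits - 1
    y = mu_law_compress(x)
    return int(round((y + 1.0) / 2.0 * levels))

def decode(code, bits=8):
    """Map an unsigned code back to an approximate sample in [-1, 1]."""
    levels = 2 ** bits - 1
    y = code / levels * 2.0 - 1.0
    return mu_law_expand(y)

# Round trip of a quiet sample through an 8-bit code
quiet = 0.01
restored = decode(encode(quiet))
```

Because the logarithmic curve spends more of the available codes on small amplitudes, quiet samples survive the 8-bit round trip with less error than they would under plain uniform quantization at the same bit depth.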

See also

  • Audio signal processing
  • Linguistics
  • Phonetics
  • Speech impediment
  • Speech signal processing
  • Speech interface guideline
  • Packet loss concealment
  • Utterance

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 