Communication Acoustics - Ville Pulkki - E-Book

Description

In communication acoustics, the communication channel consists of a sound source, a channel (acoustic and/or electric) and finally the receiver: the human auditory system, a complex and intricate system that shapes the way sound is heard. Thus, when developing techniques in communication acoustics, such as in speech, audio and aided hearing, it is important to understand the time–frequency–space resolution of hearing.

This book facilitates the reader’s understanding and development of speech and audio techniques based on our knowledge of the auditory perceptual mechanisms by introducing the physical, signal-processing and psychophysical background to communication acoustics. It then provides a detailed explanation of sound technologies where a human listener is involved, including audio and speech techniques, sound quality measurement, hearing aids and audiology.

Key features:

  • Explains perceptually-based audio: the authors take a detailed but accessible engineering perspective on sound and hearing with a focus on the human place in the audio communications signal chain, from psychoacoustics and audiology to optimizing digital signal processing for human listening.
  • Presents a wide overview of speech, from the human production of speech sounds and basics of phonetics to major speech technologies, recognition and synthesis of speech and methods for speech quality evaluation.
  • Includes MATLAB examples that serve as an excellent basis for the reader’s own investigations into communication acoustics interaction schemes which intuitively combine touch, vision and voice for lifelike interactions.


Page count: 915

Publication year: 2015




COMMUNICATION ACOUSTICS

AN INTRODUCTION TO SPEECH, AUDIO AND PSYCHOACOUSTICS

Ville Pulkki and Matti Karjalainen

Aalto University, Finland

This edition first published 2015
© 2015 John Wiley & Sons, Ltd

Registered office: John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, United Kingdom

For details of our global editorial offices, for customer services and for information about how to apply for permission to reuse the copyright material in this book please see our website at www.wiley.com.

The right of the author to be identified as the author of this work has been asserted in accordance with the Copyright, Designs and Patents Act 1988.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by the UK Copyright, Designs and Patents Act 1988, without the prior permission of the publisher.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books.

Designations used by companies to distinguish their products are often claimed as trademarks. All brand names and product names used in this book are trade names, service marks, trademarks or registered trademarks of their respective owners. The publisher is not associated with any product or vendor mentioned in this book.

Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. It is sold on the understanding that the publisher is not engaged in rendering professional services and neither the publisher nor the author shall be liable for damages arising herefrom. If professional advice or other expert assistance is required, the services of a competent professional should be sought.

MATLAB® is a trademark of The MathWorks, Inc. and is used with permission. The MathWorks does not warrant the accuracy of the text or exercises in this book. This book’s use or discussion of MATLAB® software or related products does not constitute endorsement or sponsorship by The MathWorks of a particular pedagogical approach or particular use of the MATLAB® software.

Library of Congress Cataloging-in-Publication Data

Pulkki, Ville.
Communication acoustics : an introduction to speech, audio, and psychoacoustics / Ville Pulkki and Matti Karjalainen.
pages cm
Includes index.
ISBN 978-1-118-86654-2 (hardback)
1. Sound–Recording and reproducing. 2. Hearing. 3. Psychoacoustics. 4. Sound. I. Karjalainen, Matti, 1946-2010. II. Title.
TK7881.4.P85 2015
620.2–dc23
2014030757

A catalogue record for this book is available from the British Library.

To Vappu, Sampo, Raisa, and HUT people

To Sari, Veikko and Sannakaisa, and Aalto Acoustics people

About the Authors

Ville Pulkki has been working in the field of audio since 1995. In his PhD thesis (2001) he developed a method to position virtual sources for 3D loudspeaker set-ups after researching the method using psychoacoustic listening tests and binaural computational models of human hearing. Later he worked on the reproduction of recorded spatial sound scenarios, on the measurement of head-related acoustics and on the measurement of room acoustics with laser-induced pressure pulses. Currently he holds a tenure-track assistant professor position at Aalto University and runs a research group with 18 researchers. He is a fellow of the Audio Engineering Society (AES) and has received the AES Publication Award. He has also received the Samuel L. Warner memorial medal from the Society of Motion Picture and Television Engineers (SMPTE). He has a background in music, having received teaching from the Sibelius Academy in singing and audio engineering alongside instruction in various musical instruments. He has also composed and arranged music for many different ensembles. He enjoys being with his family, renovating his summerhouse and dancing to hip hop.

Matti Karjalainen (1946–2010) began his career as an associate professor of acoustics at the Helsinki University of Technology (TKK) in the 1980s. He maintained a long and prolific career as a researcher and visionary leader in acoustic and audio signal processing, both as a pioneer of Finnish-language speech synthesis and as developer of the first portable microprocessor-based text-to-speech synthesizer in the world. For ten years he was Finland's only university professor of acoustics, leading the Laboratory of Acoustics and Audio Signal Processing at the TKK (now Aalto University) until 2006. Some of his groundbreaking work included applying his expert knowledge of psychoacoustics to computational auditory models, as well as sophisticated physical modelling of stringed instruments utilizing the fractional delay filter design for tuning, a now standard technique in this field. Later in life, augmented reality audio and spatial audio signal processing remained among Matti's greatest research interests. For his achievements in audio signal processing Matti received the Audio Engineering Society fellowship in 1999, the AES silver medal in 2006 and the IEEE fellowship in 2009. On his 60th birthday he founded the Matti Karjalainen Fund, which supports young students in studying acoustics. Matti's share of the revenues from this book is routed to the fund. In May 2010, Matti passed away at home, survived by his wife, daughter and son Sampo, a well-known software designer living in the US.

Preface

The book Kommunikaatioakustiikka by Matti Karjalainen (1946–2010) has always been around during my research career in audio and psychoacoustics, starting from my PhD studies (1996–2001), through the periods when I was a postdoc (2001–2005), a senior researcher (2005–2012), and now during my tenure track professorship (2012–). I first used the book as a reference, as it summarized many relevant topics and provided good pointers on where to find more information. I have also been teaching the corresponding course at Aalto University (the university formerly called Helsinki University of Technology), first during Matti's sabbatical years, then during his sick leave, and regularly after his passing away. It was my and many other people's opinion that the book was great, but it did not have a counterpart written in English. Matti himself knew this, and he worked on a translation, of which he completed about 30% in 2002, including the preface that follows.

For a long time I thought that I should finish Matti's work, as it would benefit people in the fields of audio and speech. However, I also understood that it would be quite a hard job. The final motivation came from my university, which stated that all MSc-level teaching should have course material in English as well, starting from autumn 2015. So, in autumn 2013, I decided to complete the book. To ensure that I would really do it, I proposed the book to Wiley, since I understood that I needed a deadline. I also thought that international distribution would benefit the propagation of the book. The book project meant a period of 10 months where I worked so much that I felt that my hands were stuck permanently to my laptop.

The book grew by about 30% from the original Finnish book, as I added quite a bit more material on audio techniques and updated many parts of the book. Consequently, the subject matter of the book might be too large for a single-semester course. Teachers are encouraged to leave some chapters out, as the whole book might be too much information to be digested in one go. I shall be updating the companion web page of this book with sound examples and other material to help teachers of such courses.

The book covers many fields within acoustics, and without great help from many professionals in the field, the book would be less detailed and less complete. First of all, I received very kind help in translating and updating the text from my PhD students Marko Takanen (Chapter 13), Teemu Koski (Chapter 19), and Olli Rummukainen (Section 11.7). Juha Vilkamo and Marko Takanen also provided text and figures from their PhD theses. The following professionals have read and commented on, or otherwise helped with, the project: Paavo Alku, Brian C. J. Moore, Mikko Kurimo, Ville Sivonen, Nelli Salminen, Ilkka Huhtakallio, Cleopatra Pike, Catarina Hiipakka, Alessandro Altoè, Mikko-Ville Laitinen, Søren Bech, Archontis Politis, Olli Santala, Sascha Disch, Tapio Lokki, Lauri Savioja, Hannu Pulakka, Richard Furse, Unto K. Laine, Vesa Välimäki, Javier Gómez Bolaños, Cumhur Erkut, Damian Murphy, Simon Christiansen, Jesper Ramsgaard, Bastian Epp, Athanasios Mouchtaris, Nikos Stefanakis, Antti Kelloniemi, Kalle Koivuniemi, Ercan Altinsoy, Lauri Juvela, Symeon Delikaris-Manias, Tapani Pihlajamäki, Antti Jylhä, Tuomo Raitio, Martti Vainio, Gaëtan Lorho, Mari Tervaniemi, Antti Kuusinen, Jouni Pohjalainen, Christian Uhle, Torben Poulsen, Davide Rocchesso, Nick Zacharov, and Thibaud Necciari. Luis R. J. Costa worked on removing the worst Finglishisms in the book, making them into more readable English expressions.

I, of course, hope that the book is successful, and that new editions come out in time. With that in mind, and following the tradition started by Brian C. J. Moore in his book An Introduction to the Psychology of Hearing, I would hereby like to open a similar contest. A prize of a box of Finnish chocolate confections will be awarded to the reader who spots the most errors in this edition, and writes to me to point them out. Game on!

I hope you will enjoy reading the book, and that you will find it beneficial in your research work and studies.

Ville Pulkki
Otaniemi, Espoo, Finland
May 2014

Introduction

Efficient use of sensory functions and communication has been one of the most important factors in the evolution and survival of animals in nature. Especially for the highest forms of evolution, vision and hearing are the two main modalities to support this view in a complementary way. Visual information, based on the laws of optics, reflects the environment in a geometrically appropriate and reliable manner, while auditory sensing and perception, based on the laws of acoustics, are less dependent on physical constraints such as obstacles between an observer and objects to be perceived. Vision often dominates audition, especially if an object is clearly visible or moving, while hearing may capture important information even when there are no perceivable visual events. Sensory integration, i.e., fusion of information from different modalities into a coherent percept, is characteristic of living species. Only when senses provide conflicting cues must they compete for contribution to the final percept.

The two main uses of the auditory sense are orientation in the environment and communication between subjects. The former activity can be found among early phases of animal evolution. As an example, the sound events shown in Figure I.1 bring information to the subject about the surroundings. The sounds caused by the shoes of the horse imply the type of terrain, the wind sounds bring information about the weather, and the sounds caused by animals, even from visually obscured locations, report their presence, action, and location. Any of the sounds may reach conscious attention and may startle the subject, with resulting reactions.

Figure I.1 Environmental orientation; a situation where information from external objects is carried by sound. The listener can both localize the sound sources and also decipher the cause of the sound.

Species with highly advanced and specialized hearing abilities have evolved, for example the echolocating bat. This animal sends chirps (frequency sweeps) and receives their reflections from surrounding objects. Auditory analysis of the echoes enables the bat to construct an image-like representation of its environment for navigation, even in fast flight. Many animals have a very sensitive, accurate, or specialized auditory system. The hearing system is important as an early warning indicator of dangerous situations or as an aid for hunting.

Sound is an excellent means for communication. Uttering of sounds is an easy way to warn others or to express the internal state of the subject, such as emotions, action plans, etc. Gestures and facial expressions are useful only when there are no limitations for visual communication. Sound can, in favourable conditions, carry relatively far and propagate around a visually opaque object. One of the major disadvantages of sounds and voice is that they do not leave a permanent physical trace like a footprint in sand. Thus, animals are not able to use sounds as message records to transfer sound-based information in time.

Orientation and communication by sound are activities inherent to human beings as well. Orientation is often instinctive, without conscious attention. We continuously receive a multitude of sound information, but most of these data remain outside of our consciousness. Sounds that are unexpected or otherwise in the focus of attention can be analysed and memorized in more detail over long periods of time. If a sound is annoying, disturbing, or just so loud that it can be harmful to the hearing of a subject, it is called noise.

The human being has evolved into a being with more advanced communication abilities than other living beings. Voice production evolved towards speech and spoken language. Prerequisites for this were the development of organs for speech production and the auditory ability to analyse complex voice signals that carry linguistic and conceptual information and knowledge. Only later did man discover systematic ways to store linguistic information in written form. Even today there are spoken languages without a corresponding written language.

Speech is a fast and flexible way of expressing conceptually structured information, emotions, and intentions, as illustrated in Figure I.2. A spoken message consists of linguistic and non-linguistic information. Linguistic information is built of basic units (phonemes) and their combinations (words, phrases, sentences). Non-linguistic features, such as speaker identity and pitch that expresses emotions, are an integral part of speech and may even change the interpretation of linguistic content. Speech contains a lot of redundancy, that is, multiple ways of coding the same information, in order to function properly in adverse acoustic environments, and it is not dependent on the visibility of the speaker. A fundamental requirement for successful communication is a common code – a common language or dialect and a common conceptual model of the world.

Figure I.2 Speech communication in different situations between subjects or from the presenters to the audience. The acoustic waves carry the information either directly from the presenter to the listener or through an audio system.

Humans have developed another important type of communication by sound: music. It is not primarily for conveying linguistic and conceptual information but rather for evoking aesthetic and emotional experiences, as in Figure I.3. Music may, however, also carry strong symbolic meanings between subjects that share common musical associations to experiences and events in their cultural or social life.

Figure I.3 Musical communication with electronic sound reinforcement. The audience responds acoustically to the band by clapping hands and with their voices.

Human beings were not satisfied with the limitations of acoustic communication where a long distance was a problem and no physical trace of sound was left to convey a message in time. The first sound-recording devices were based on mechanical principles. Only through discoveries in electricity and electronics did the techniques of recording and long-distance communication of sound and voice become everyday utilities. The first devices to extend the communication range were the telephone and the radio, as shown in Figure I.4. Acoustic waves were converted to corresponding electrical signals by a microphone. Weak signals from a microphone were strengthened using electronic amplifiers. By compensating for losses in telephone lines by amplification, it became possible to transmit speech over any distance. The radio was invented for wireless broadcasting over a long range from a transmitter.

Figure I.4 Speech communication through a technical transmission channel.

Mechanical cutting using sound waveforms enabled the first recording and playback by the phonograph and the record player. Electronic amplification improved the quality of sound and made it louder. A step forward was the tape recorder, in which the only remaining mechanical function was moving the magnetic tape. Finally, digital signal processing and computers have enabled storage of sound as bits on digital media, even without any moving parts. Digital documents are, in principle, perfect, in the sense that they can be copied and stored infinitely without any loss of information. Digital signal processing further enabled digital audio and speech processing to store and transmit signals economically using audio and speech coding, where the number of bits needed is reduced by an order of magnitude. In spite of rapid digitalization, the interface to humans still remains non-digital. Analogue components are needed: microphones and amplifiers to capture sound; amplifiers together with loudspeakers or headphones for sound reproduction, to make signals audible and loud enough.
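To give a feel for what a reduction "by an order of magnitude" means in practice, the following sketch works through the arithmetic. The sampling rate, word length, channel count, and coded bit rate are assumed example values chosen for illustration; they are not figures taken from the text.

```python
# Illustrative arithmetic only: all numeric parameters below are assumed
# example values, not figures stated in the text.

def pcm_bit_rate(sample_rate_hz: int, bits_per_sample: int, channels: int) -> int:
    """Return the bit rate (bit/s) of uncompressed PCM audio."""
    return sample_rate_hz * bits_per_sample * channels

# Assumed uncompressed format: 44.1 kHz, 16 bits per sample, two channels.
uncompressed = pcm_bit_rate(44_100, 16, 2)

# Assumed bit rate of a perceptually coded stream, in bit/s.
coded = 128_000

reduction_factor = uncompressed / coded

print(f"uncompressed: {uncompressed / 1000:.1f} kbit/s")
print(f"coded:        {coded / 1000:.1f} kbit/s")
print(f"reduction:    about {reduction_factor:.0f}x")
```

Under these assumptions the uncompressed stream needs roughly eleven times as many bits as the coded one, which is the scale of saving the paragraph above refers to.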

Two very recent major steps in communications are the Internet, which is a data network to provide all forms of digital information, and the cellular wireless networks for mobile communications. Both of them enable, especially when they are integrated, access to new formats of multimedia, including sound and voice in their most advanced forms. Wireless networks allow such communication in most parts of the world, anytime, for a majority of people.

Early in the history of sound reproduction, one of the goals was to create a realistic spatial impression. Two-channel stereo was adopted in the 1960s to provide a better sound image and a more natural sound colour perception with two ears than was possible with monophonic reproduction, that is, with a single channel. Different multi-channel systems with or without elevated loudspeakers have been proposed, and nowadays a wide variety of systems is available for spatial sound reproduction. Advanced techniques for headphone listening are also available.

Generally, digital audio means all methods of sound recording, processing, synthesis, or reproduction where digital signal processing and digital processors are utilized. Perceptually-based audio techniques