Undergraduate student majoring in Computer Science engineering with a focus on Bioinformatics
Bioinformatics, Music, Programming, Research
Objective Assessment of Pitch Accuracy in Vocal Music using Signal Processing Approaches
This work presents an objective assessment of pitch accuracy in monophonic vocal music using signal processing techniques. Fifty recordings of ‘arohan’ and ‘avarohan’ in 10 Hindustani Classical Ragas were analyzed to quantify the variation in pitch among 8 singers with varied levels of musical expertise. The F0 of each rendition was estimated using the SWIPE pitch detection algorithm. Each note onset was identified computationally, which permitted objective measurement of error. Such an approach is often not accounted for in traditional musicological analysis.
Error in pitch is measured in ‘cents’, a logarithmic unit commonly used to quantify intonation in equal-tempered music, where 100 cents make up one equal-tempered semitone. It was observed that singers with professional training have deviations within 15 - 20 cents, while non-musicians have deviations above 50 cents. The interactive feedback also visually depicts how close the user’s pitch is to the pitch that should have been sung. Such a quantitative evaluation system will greatly aid the training of singers.
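As an illustration of the cents measure, here is a minimal Python sketch using the standard definition, 1200 times the base-2 logarithm of the ratio of the sung frequency to the reference frequency. The function names and the example frequencies are hypothetical and are not taken from the work described above; F0 estimation and note onset detection are assumed to have been done already, so each note is represented by a single frequency in Hz.

```python
import math


def cents_deviation(sung_hz: float, reference_hz: float) -> float:
    """Signed pitch error in cents (positive = sharp, negative = flat).

    Uses the standard definition: 1200 * log2(sung / reference),
    so one equal-tempered semitone corresponds to 100 cents.
    """
    if sung_hz <= 0 or reference_hz <= 0:
        raise ValueError("frequencies must be positive")
    return 1200.0 * math.log2(sung_hz / reference_hz)


def mean_absolute_deviation(sung_notes_hz, reference_notes_hz) -> float:
    """Average unsigned error in cents over paired notes.

    Both arguments are equal-length sequences holding one frequency
    per note (hypothetical input format, assumed for this sketch).
    """
    if len(sung_notes_hz) != len(reference_notes_hz):
        raise ValueError("note sequences must have the same length")
    if not sung_notes_hz:
        raise ValueError("at least one note is required")
    errors = [
        abs(cents_deviation(sung, ref))
        for sung, ref in zip(sung_notes_hz, reference_notes_hz)
    ]
    return sum(errors) / len(errors)


# Example with made-up values: a note sung slightly sharp of A4 (440 Hz).
error = cents_deviation(445.0, 440.0)
```

Because the measure is logarithmic, the same frequency ratio yields the same number of cents in any octave, which makes errors comparable across low and high notes.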
Authors: José M. Iñesta; Darrell Conklin; Rafael Ramírez | Article URL: http://www.tandfonline.com/doi/full/10.1080/17459737.2016.1216369?ai=z4&mi=3fqos0&af=R | Citation: Journal of Mathematics and Music | Publication Date: 2016-10-17T09:55:51Z | Journal: Journal of Mathematics and Music: Mathematical and Computational Approaches to Music Theory, Analysis, Composition and Performance
Pub.: 17 Oct '16, Pinned: 17 Nov '17
Abstract: Optical Music Recognition (OMR) is an important technology within Music Information Retrieval. Deep learning models show promising results on OMR tasks, but symbol-level annotated data sets of sufficient size to train such models are not available and are difficult to develop. We present a deep learning architecture called a Convolutional Sequence-to-Sequence model to both move towards an end-to-end trainable OMR pipeline and apply a learning process that trains on full sentences of sheet music instead of individually labeled symbols. The model is trained and evaluated on a human-generated data set, with various image augmentations based on real-world scenarios. This data set is the first publicly available set in OMR research of sufficient size to train and evaluate deep learning models. With the introduced augmentations, a pitch recognition accuracy of 81% and a duration accuracy of 94% are achieved, resulting in a note-level accuracy of 80%. Finally, the model is compared to commercially available methods, showing a large improvement over these applications.
Pub.: 16 Jul '17, Pinned: 17 Nov '17
Abstract: Pitch is a crucial parameter in speech and music signals. However, under severe noise, missing harmonics, or unsuitable physical vibration, determining pitch with good accuracy is a great challenge. In this paper, we propose a method for pitch estimation of speech and music sounds. Our method is based on the fast Fourier transform (FFT) of the multi-scale product (MP) provided by an auditory feature model of the sound signals. The auditory model simulates the spectral behaviour of the cochlea with a gammachirp filter-bank, and the outer/middle ear filtering with a low-pass filter. For the two output channels, the FFT of the MP is computed over frames. The MP is formed by taking the product of the wavelet transform coefficients of the speech or music signal at three scales. The experimental results show that our method estimates the pitch with high accuracy. Moreover, the proposed method outperforms several other pitch detection algorithms in clean and noisy environments.
Pub.: 28 Nov '15, Pinned: 17 Nov '17
Abstract: One of the most important aspects of singing is the control of fundamental frequency. The effects on pitch inaccuracy, defined as the distance in cents in equally tempered tuning between the reference note and the sung note, of the following conditions were evaluated: (1) level of external feedback, (2) tempo (slow or fast), (3) articulation (legato or staccato), (4) tessitura (low, medium, or high), and (5) semi-phrase direction (ascending or descending). The subjects were 10 nonprofessional singers and 10 classically trained professional or semi-professional singers (10 men and 10 women). Subjects sang one octave and a fifth arpeggi with three different levels of external auditory feedback, two tempi, and two articulations (legato or staccato). It was observed that inaccuracy was greatest in the descending semi-phrase arpeggi produced at a fast tempo and with a staccato articulation, especially for nonprofessional singers. The magnitude of inaccuracy was also relatively large in the high tessitura relative to the low and the medium tessitura for such singers. Contrary to predictions, when external auditory feedback was strongly attenuated by the hearing protectors, nonprofessional singers showed greater pitch accuracy than in the other external feedback conditions. This finding indicates the importance of internal auditory feedback in pitch control. With an increase in training, the singer's pitch inaccuracy decreases.
Pub.: 08 Mar '16, Pinned: 17 Nov '17