Voice Verification System Based on Bark-frequency Cepstral Coefficient

Karisma Trinanda Putra


Data verification systems are evolving towards more natural systems that use biometric media. In daily interactions, humans use their voice to communicate with others, and voice characteristics can also be used to identify who is speaking. The problem is that background noise and the unique signal characteristics of each person make speaker classification more complex. Identifying the speaker requires an understanding of the speech signal feature extraction process. We developed a technique that extracts the voice characteristics of each speaker based on spectral analysis. This research is useful for the development of biometric-based security applications. First, voice activity detection separates the voice signal from the pauses. Then the voice characteristics are extracted as bark-frequency cepstral coefficients. Finally, the set of cepstral coefficients is classified by speaker using an artificial neural network. Accuracy reached about 82% in the voice recognition process with 10 speakers, while the highest accuracy, 93%, was obtained with only 1 speaker.
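The first two stages of the pipeline described above can be illustrated with a minimal sketch. The abstract does not state which voice activity detection rule or which Bark-scale formula was used, so everything below is an assumption made for illustration: the detector is a simple short-time energy threshold, the Bark conversion is the common Zwicker-Terhardt approximation, and the names `hz_to_bark`, `frame_energies`, `detect_voice`, `frame_len` and `ratio` are hypothetical.

```python
import math


def hz_to_bark(f_hz):
    """Map a frequency in Hz to the Bark scale.

    Uses the Zwicker-Terhardt approximation (an assumption; the paper's
    exact formula is not given in the abstract).
    """
    return 13.0 * math.atan(0.00076 * f_hz) + 3.5 * math.atan((f_hz / 7500.0) ** 2)


def frame_energies(signal, frame_len):
    """Mean squared amplitude of consecutive non-overlapping frames."""
    return [
        sum(x * x for x in signal[i:i + frame_len]) / frame_len
        for i in range(0, len(signal) - frame_len + 1, frame_len)
    ]


def detect_voice(signal, frame_len=160, ratio=0.1):
    """Flag frames whose energy exceeds ratio * (maximum frame energy).

    This energy-threshold rule is an assumed stand-in for the paper's
    voice activity detection step, not the authors' method.
    """
    energies = frame_energies(signal, frame_len)
    if not energies:
        return []
    threshold = ratio * max(energies)
    return [e > threshold for e in energies]


# Synthetic example: 0.1 s of silence, 0.1 s of a 440 Hz tone, 0.1 s of silence,
# sampled at 8 kHz (all values chosen only for the demonstration).
sample_rate = 8000
silence = [0.0] * 800
tone = [0.5 * math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(800)]
flags = detect_voice(silence + tone + silence)
```

With these settings, only the frames containing the tone are flagged as voice. In a full system, the spectrum of each flagged frame would then be warped onto the Bark scale with `hz_to_bark` before the cepstral coefficients are computed and passed to the neural network classifier.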


artificial neural network, bark-frequency cepstral coefficient, voice activity detection






Office Address:

Journal of Electrical Technology UMY

Department of Electrical Engineering, Universitas Muhammadiyah Yogyakarta

Jl. Brawijaya, Kasihan, Bantul, Daerah Istimewa Yogyakarta

Phone/Fax: +62274-387656/ +62274-387646, E-mail: ramadoni@umy.ac.id, jet@umy.ac.id