Speech Recognition System Using Mel-Frequency Cepstral Coefficients (Sistem Pengenal Wicara Menggunakan Mel-Frequency Cepstral Coefficient)

Authors

  • Karisma Trinanda Putra, Universitas Muhammadiyah Yogyakarta

DOI:

https://doi.org/10.18196/st.v20i1.2358

Keywords:

voice activity detection, mel-frequency cepstral coefficient, artificial neural network

Abstract

Human-machine interaction is evolving toward more adaptive and interactive systems. Several media can be used for human-machine interaction, including voice signals. Speech recognition involves converting an analog signal into its intended meaning, and its success depends on the noise level and on the reliability of the feature extraction method. In addition, differences in pronunciation between speakers produce a wide variety of voice signal patterns. This research develops a system that recognizes and translates speech according to the data it has been trained on, and that can be modified to suit user requirements. The voice signal is first separated from silence using voice activity detection. It is then converted to the frequency domain, and features are extracted as mel-frequency cepstral coefficients (MFCC). The resulting cepstral values are identified as words by an artificial neural network. The study uses a computer with a microphone as the recording device and the Pascal programming language to build the application. In the experiments, speech recognition accuracy is 87% with 28 vocabulary sets. Accuracy decreases as the number of vocabulary sets grows. However, the more speech variations are pronounced, the higher the accuracy, which averages around 93%.
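The front end described in the abstract (voice activity detection, conversion to the frequency domain, MFCC extraction) can be illustrated with a minimal sketch. This is not the author's implementation, which was written in Pascal. It is a plain-Python approximation in which the energy-threshold detector, the 8 kHz sample rate, the 256-sample frames, the 20 mel filters, the 12 coefficients, and all function names are assumptions chosen for illustration. The neural network classifier is omitted.

```python
import math

def frame_signal(samples, frame_len, hop):
    """Split a list of samples into overlapping fixed-length frames."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def voiced_frames(frames, threshold):
    """Assumed energy-based VAD: keep frames whose mean power exceeds threshold."""
    return [f for f in frames if sum(x * x for x in f) / len(f) > threshold]

def power_spectrum(frame):
    """Apply a Hamming window, then return |DFT|^2 / N for bins 0..N/2."""
    n = len(frame)
    win = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
           for i, x in enumerate(frame)]
    spec = []
    for k in range(n // 2 + 1):
        re = sum(w * math.cos(2 * math.pi * k * i / n) for i, w in enumerate(win))
        im = -sum(w * math.sin(2 * math.pi * k * i / n) for i, w in enumerate(win))
        spec.append((re * re + im * im) / n)
    return spec

def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters with centers spaced evenly on the mel scale."""
    top = hz_to_mel(sample_rate / 2)
    # Filter edges expressed as fractional DFT bin positions.
    edges = [mel_to_hz(top * i / (n_filters + 1)) * n_fft / sample_rate
             for i in range(n_filters + 2)]
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = edges[m - 1], edges[m], edges[m + 1]
        row = []
        for k in range(n_fft // 2 + 1):
            if lo < k <= mid:
                row.append((k - lo) / (mid - lo))
            elif mid < k < hi:
                row.append((hi - k) / (hi - mid))
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def mfcc(frame, sample_rate, n_filters=20, n_coeffs=12):
    """Log mel-filterbank energies followed by a DCT-II."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    log_e = [math.log(sum(w * p for w, p in zip(row, spec)) + 1e-10)
             for row in bank]
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / n_filters)
                for j, e in enumerate(log_e))
            for c in range(n_coeffs)]

# Synthetic input: 256 samples of silence, then 512 samples of a 440 Hz tone.
RATE = 8000
signal = [0.0] * 256 + [math.sin(2 * math.pi * 440 * t / RATE) for t in range(512)]
frames = frame_signal(signal, frame_len=256, hop=128)
voiced = voiced_frames(frames, threshold=0.01)
features = [mfcc(f, RATE) for f in voiced]  # one 12-value vector per voiced frame
```

Each vector in `features` is the kind of input a word classifier such as an artificial neural network would receive. A practical system would typically replace the direct DFT with an FFT and tune the threshold to the recording conditions.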

Author Biography

Karisma Trinanda Putra, Universitas Muhammadiyah Yogyakarta

Department of Electrical Engineering

Published

2017-05-18

How to Cite

Putra, K. T. (2017). Sistem Pengenal Wicara Menggunakan Mel-Frequency Cepstral Coefficient. Semesta Teknika, 20(1), 75–80. https://doi.org/10.18196/st.v20i1.2358

Section

Articles