SMART-In English: Learn English Using Speech Recognition

English is an international language and is important to learn. For many learners, English is difficult, especially its pronunciation. SMART-In English is a prototype Android app that uses speech recognition technology through the Cloud Speech API (Application Programming Interface). It can be used as an alternative medium for learning English, especially the pronunciation of words. Using speech recognition, the app records the user's speech, displays a score for the spoken pronunciation, shows the pronunciation level of the word, and displays the correct pronunciation.

Keywords: SMART-In English, Pronunciation, Speech Recognition, Cloud Speech.


I. INTRODUCTION
Some people find it difficult to learn the pronunciation of English vocabulary when English is not their native language. Differences in speaking habits are one cause of the frequent difficulties in English pronunciation.
Technological developments can be applied in many aspects of life, such as creating tools, industrial use [6], [7], producing renewable energy [8], and creating learning media. Methods for learning pronunciation continue to grow. This study builds a prototype, SMART-In English, as a learning medium in the form of an Android-based mobile application. The prototype is built using speech recognition technology that utilizes the Cloud Speech API (Application Programming Interface).
Google, the largest search engine in the world, continues to evolve its services. The Cloud Speech API is part of the Google Cloud Platform, a set of services that provides the main components for building cloud-based applications. These services include Google App Engine, Google Compute Engine, Google Cloud Storage, and Google BigQuery, all intended for developers who want to integrate Google services into their apps [9] [10] [11] [12].
Among Google's pre-trained machine learning models, the Google Translate API and the Cloud Vision API are offered alongside the Google Cloud Speech API. With such a complete set of APIs, developers can build applications that can see, hear, and translate [13].
A cloud API determines how apps communicate with cloud computing services. It offers a way for apps to request information from the platform and use its facilities [14]. Cloud computing is the use of computer technology and Internet-based services that allow on-demand access to a large pool of resources that can be provisioned quickly with minimal management effort [15] [16].
Cloud APIs can be applied in education and in teaching and learning activities [10], for example in online document editors that use a Web API [17]. This is useful for developing learning activities and students' knowledge, as well as for increasing motivation [18]. Reported benefits of Google Cloud API services in education include motivation, fun, information technology skills, ease of use, cost savings, protected privacy, guaranteed security, and innovation [19].
The Google Cloud Speech API can be used to develop media for learning English vocabulary pronunciation using speech recognition technology. Speech recognition is the process of converting voice signals into machine-readable digital data (usually simple text) [20] [21]. It is an interdisciplinary subfield of computational linguistics that develops technologies and methodologies that enable the recognition and translation of spoken language into text by computers or applications [22] [23]. Voice recognition [24]-[28] implies the ability to match sound signals against patterns from an acquired vocabulary so that computers or other machines can recognize words spoken by humans [29] [30].
This can be applied in SMART-In English, an interactive application that supports and simplifies the process of learning English vocabulary pronunciation. The application can make users more interested, optimize the learning process, give users the opportunity to become more confident in learning, and broaden their horizons. Good English pronunciation is a must: even if the vocabulary and grammar are correct, communication fails if the other person does not understand the message. Difficulty pronouncing a word or phrase the way native speakers do often makes learners feel inhibited and reduces their confidence when engaging in an English conversation. Repeated and frequent practice is needed to achieve the best results.
Based on the analysis, SMART-In English is developed with a single-user concept, and its system architecture is shown in Figure 2. The development of applications, from small scale to enterprise scale, faces challenges over time. These include improving development productivity, responding to growing demands, sustaining the value of an ongoing information system, and maintaining system security. For this reason, software architecture plays an important role in addressing these issues.
Software architecture describes the structure of a system in terms of: (1) software elements, portrayed as abstractions in the form of system modules or high-level components; (2) externally visible properties of elements, which describe the features an element exposes and the services it provides to other elements; and (3) relationships between elements, which describe how elements interact with each other. Figure 2 shows the general system architecture. The user records their voice in English, and the Android device captures the voice and transmits it over the internet to the Cloud Speech API in the Google Cloud Platform for processing. If the speech is identified, the service sends a response back to the user, and SMART-In English displays the result of the spoken pronunciation.
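The record, transmit, recognize, and display flow of the architecture can be sketched in a few lines. The sketch below is illustrative only and is not the application's actual Android code: the names `Recognizer`, `PronunciationResult`, `check_pronunciation`, and `fake_recognizer` are hypothetical, and the cloud call is replaced by a local stand-in so the example runs without network access or credentials. In a real client, the `recognize` argument would wrap a request to the Cloud Speech API.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical type: any function that takes recorded audio bytes and returns
# the recognized text, or None when the speech could not be identified.
Recognizer = Callable[[bytes], Optional[str]]


@dataclass
class PronunciationResult:
    target: str                # word the user was asked to pronounce
    recognized: Optional[str]  # text returned by the recognizer, if any
    correct: bool              # True when the recognized text matches the target


def check_pronunciation(audio: bytes, target: str, recognize: Recognizer) -> PronunciationResult:
    """Send recorded audio to a recognizer and compare the reply with the target word."""
    recognized = recognize(audio)
    correct = (
        recognized is not None
        and recognized.strip().lower() == target.strip().lower()
    )
    return PronunciationResult(target=target, recognized=recognized, correct=correct)


def fake_recognizer(audio: bytes) -> Optional[str]:
    """Assumed stand-in for the cloud service: treats the bytes as already-transcribed text."""
    return audio.decode("utf-8") if audio else None


if __name__ == "__main__":
    print(check_pronunciation(b"Dog", "dog", fake_recognizer))
    print(check_pronunciation(b"", "dog", fake_recognizer))
```

Passing the recognizer in as a function keeps the comparison logic independent of the speech service, so the same check can be exercised offline with a stand-in and online with a real client.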

A. SMART-In English Development
The use case diagram in Figure 3 illustrates the flow of the app, in which the actor is the user. When the user presses a button, the app presents several "classes", four in total. When the user presses the About button, information about the app is displayed.

B. Implementation of SMART-In English
The analysis, design, and construction of the application produced SMART-In English, a medium for learning English pronunciation using speech recognition that utilizes the Cloud Speech API.

1) Navigation Structure
SMART-In English provides menus (Figure 5) created to meet the needs of learning English pronunciation. The menus are designed around the object categories displayed in SMART-In English and expose the functions available to users in the application.

2) Menus
When the user opens the SMART-In English application, the Main Menu page is shown (Figure 6 (a)). The Play Menu (Figure 6 (b)) offers a choice of learning material: Animals, Body, Objects, and Fruits.

3) Usage
To use SMART-In English, users simply select one of the menus: Animals, Body, Objects, or Fruits. Figure 7 shows the categories and the correct pronunciation of a word: (a) shows how to pronounce \dog\; (b) shows how to pronounce \laptop\; (c) shows how to pronounce \lemon\. SMART-In English shows whether the pronunciation is correct or incorrect and displays the score of the spoken pronunciation (Figure 8). If the user pronounces the word incorrectly, the app displays the correct pronunciation and gives a similarity score (Figure 9). In that case, the message "Jawaban Anda Salah" ("Your answer is wrong") appears.
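The paper does not state how the similarity score is computed. As one possible illustration, the sketch below scores the recognized text against the target word with a character-level similarity ratio from Python's standard `difflib` module, scaled to 0-100. The function names `similarity_score` and `feedback`, the 100-point threshold, and the "Correct" message are assumptions made for this example. Only the message "Jawaban Anda Salah" comes from the application.

```python
from difflib import SequenceMatcher

# Message shown by the app for a wrong answer (Indonesian: "Your answer is wrong").
WRONG_ANSWER_MESSAGE = "Jawaban Anda Salah"
# Assumed message for a correct answer; the paper does not quote one.
CORRECT_ANSWER_MESSAGE = "Correct"


def similarity_score(recognized: str, target: str) -> int:
    """Return a 0-100 character-level similarity between two words (assumed metric)."""
    a = recognized.strip().lower()
    b = target.strip().lower()
    return round(SequenceMatcher(None, a, b).ratio() * 100)


def feedback(recognized: str, target: str) -> tuple[int, str]:
    """Return the score and the message to display for one attempt."""
    score = similarity_score(recognized, target)
    # Assumption: only an exact match (score 100) counts as correct.
    message = CORRECT_ANSWER_MESSAGE if score == 100 else WRONG_ANSWER_MESSAGE
    return score, message


if __name__ == "__main__":
    for spoken in ("lemon", "lemen", "laptop"):
        print(spoken, feedback(spoken, "lemon"))
```

A text-level ratio like this only measures how close the transcription is to the target spelling. It does not evaluate the audio itself, so it should be read as a rough proxy for pronunciation quality.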

C. Evaluation and Testing
The main purpose of software testing is to ensure that the software produced matches the previously determined requirements. Once the requirements of a system have been compiled, a test plan should be prepared. In addition, the testing process needs a measurable final goal so that the tester can stop testing when that goal is achieved.
This stage is very important because its main goal is to ensure that the system's components function as expected and in accordance with the concept. Testing is done in two stages: application testing and user acceptance testing.

1) Application Testing
At this stage, testing is done using the black box testing method, which tests a program based on its functions. The purpose of black box testing is to find functional errors in the program (Table I and Table II).
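To make the idea of black box testing concrete, the sketch below checks a function purely through its inputs and expected outputs, without looking at how it is implemented. It does not reproduce the test cases of Table I or Table II. The function `is_correct`, the test class, and the chosen cases are hypothetical and only illustrate the method using Python's standard `unittest` module.

```python
import unittest
from typing import Optional


def is_correct(recognized: Optional[str], target: str) -> bool:
    """Hypothetical function under test: does the recognized text match the target word?"""
    if recognized is None:
        return False
    return recognized.strip().lower() == target.strip().lower()


class BlackBoxPronunciationTest(unittest.TestCase):
    """Each case states an input and the expected output only (assumed cases)."""

    def test_matching_word_is_accepted(self):
        self.assertTrue(is_correct("Dog", "dog"))

    def test_different_word_is_rejected(self):
        self.assertFalse(is_correct("duck", "dog"))

    def test_unrecognized_speech_is_rejected(self):
        self.assertFalse(is_correct(None, "dog"))


def run_black_box_tests() -> unittest.TestResult:
    """Run the cases and return the collected result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(BlackBoxPronunciationTest)
    result = unittest.TestResult()
    suite.run(result)
    return result


if __name__ == "__main__":
    outcome = run_black_box_tests()
    print("tests run:", outcome.testsRun, "all passed:", outcome.wasSuccessful())
```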

2) User Acceptance
The results of user acceptance testing show that the application performs in accordance with expectations and is feasible for use as an alternative medium for learning English pronunciation.

IV. CONCLUSION AND FUTURE STUDY
SMART-In English has been successfully implemented, utilizing the Cloud Speech API to develop an English pronunciation learning medium based on speech recognition technology. SMART-In English is still limited to single English words, and future research is expected to extend it to word combinations or English sentences. Expanding the vocabulary is also an important concern in the further development of this application.