Covid-19 Notification System using AWS Cloud, Alexa Skills Kit and Time Series Forecasting with Facebook Prophet

Due to this pandemic, we are all stuck in the same storm but on different boats. As we know, “prevention is better than cure” and any small step towards prevention can make a huge difference.

I wanted to build something that sends me an alert message with the daily new Covid-19 cases around me in my State, County and Zip Code as an SMS to my mobile phone.

Being a Cloud Engineer, I couldn’t find a better way than to leverage the power of cloud computing by incorporating multiple services and making them work together to build this system on AWS Cloud.

I believe that any application with voice interaction makes the application more user friendly. With this thought I developed an Alexa skill to respond to the user based on their request.

After collecting three months of historical Covid-19 data, I forecasted the daily new Covid-19 cases for 10 days into the future using Facebook Prophet.

Architecture diagram for this system:

Design of Covid-19 Notification System

These are the three open APIs serving JSON data that are used in this project. Each one is updated daily with Covid-19 data:

State-wise daily new cases and other Covid-19 data in the USA: https://covidtracking.com/api/states/daily

County-wise daily new cases and other Covid-19 data in the USA: https://covid19-us-api.herokuapp.com/county

Zip-code-wise daily new cases and other Covid-19 data for St. Louis County, Missouri, USA: https://data-stlcogis.opendata.arcgis.com/datasets/covid-19-zip-code-data?selectedAttribute=zip_5

See this page for more information on the APIs: https://covidtracking.com/data/api

I’ve used different AWS services to implement the system shown in the architecture above.

First, the required infrastructure on AWS was built using an AWS SAM template. AWS SAM is an extension of AWS CloudFormation that retains all CloudFormation capabilities. SAM transforms and expands the SAM template into an AWS CloudFormation template, which lets us build serverless applications faster.

AWS Lambda, SNS, CloudWatch Events and CloudWatch logs are also used.

I used AWS Lambda because there are no servers to manage, and the Lambda function can be invoked whenever we need to fetch the data from the respective APIs.

AWS Lambda runs our code on Amazon Linux servers that are maintained by AWS rather than by the user, which is why Lambda is called a serverless service. The function still has access to the “/tmp” path of that Amazon Linux environment, and we can use this path in the Lambda function to store temporary files.

The CSV file in the S3 bucket has four columns: State, County, ZipCode and Date.

Every time the Lambda function is invoked, the CSV file (cases.csv) is downloaded from S3 to the “/tmp” path in the Lambda function so that the necessary operations can be performed on it.

Our Python code has different functions to get the daily new Covid-19 cases for State, County and ZipCode from the JSON data. These functions are called inside the lambda handler and the data collected will be appended to the downloaded CSV file at “/tmp/cases.csv”.

This function gets the daily new cases in the State.
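As a rough sketch of what such a function might look like: the field names "state", "date" and "positiveIncrease" are assumptions about the JSON returned by the state API, and the helper names latest_new_cases and get_state_cases are hypothetical, not the ones used in the project.

```python
import json
import urllib.request

STATES_DAILY_URL = "https://covidtracking.com/api/states/daily"


def latest_new_cases(records, state_code):
    """Return the most recent daily new-case count for one state.

    Assumes each record is a dict with "state", "date" (an integer such as
    20200615) and "positiveIncrease" keys. These field names are assumptions.
    """
    state_rows = [r for r in records if r.get("state") == state_code]
    if not state_rows:
        raise ValueError(f"no records found for state {state_code!r}")
    newest = max(state_rows, key=lambda r: r["date"])
    # A missing or null count is treated as zero new cases.
    return int(newest.get("positiveIncrease") or 0)


def get_state_cases(state_code="MO", url=STATES_DAILY_URL):
    """Download the JSON feed and return the latest count for the state."""
    with urllib.request.urlopen(url, timeout=10) as response:
        records = json.load(response)
    return latest_new_cases(records, state_code)
```

Keeping the parsing separate from the download makes the logic easy to check with a small hand-written list of records before pointing it at the live API.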

We have similar functions written for County and ZipCode.

Initial few lines of lambda_handler

Once the data is appended, the CSV file is uploaded back to the S3 bucket, and the numbers of cases (state, county and zip) are stored in variables so that they can be sent to the user through SNS.
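The download, append and upload round trip could be sketched roughly as follows. This assumes s3_client is a boto3 S3 client (for example boto3.client("s3")), and the function name append_daily_row, the bucket and the key are placeholders rather than the project's real names.

```python
import csv

LOCAL_PATH = "/tmp/cases.csv"


def append_daily_row(s3_client, bucket, key, row, local_path=LOCAL_PATH):
    """Download the CSV, append one row, and upload it back to S3.

    s3_client is expected to behave like a boto3 S3 client. The bucket and
    key arguments are placeholders for the real bucket and object names.
    """
    s3_client.download_file(bucket, key, local_path)
    with open(local_path, "a", newline="") as csv_file:
        # Column order follows the file layout: State, County, ZipCode, Date.
        csv.writer(csv_file).writerow(
            [row["State"], row["County"], row["ZipCode"], row["Date"]]
        )
    s3_client.upload_file(local_path, bucket, key)
```

Passing the client in as an argument keeps the function easy to exercise with a stand-in object, without touching a real bucket.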

An SNS topic is created in the SAM template, with the user’s mobile number as its endpoint.

The Lambda function uses this SNS topic to send the message.
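Publishing to the topic might look roughly like this. It assumes sns_client is a boto3 SNS client (for example boto3.client("sns")); the function name send_alert and the message wording are illustrative only.

```python
def send_alert(sns_client, topic_arn, state_cases, county_cases, zip_cases):
    """Publish the daily counts to the SNS topic and return the message text."""
    message = (
        "Daily new Covid-19 cases: "
        f"State {state_cases}, County {county_cases}, ZipCode {zip_cases}."
    )
    # publish() delivers the message to every endpoint subscribed to the topic.
    sns_client.publish(TopicArn=topic_arn, Message=message)
    return message
```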

Based on the user’s preference and the total population of the area, threshold values are set for the number of daily new cases in the State, County and ZipCode, as shown below.

For example, there are around 6 million residents in the state of Missouri, around 1 million residents in St. Louis County and around 29,000 residents in the ZipCode (~4 mile radius).

Lambda compares the daily new cases with these threshold values, and the user gets a customized message based on the threshold level, or zone, that the number of cases falls under.

Suppose there are 1450 new Covid-19 cases today in Missouri. The state is then considered a “Danger_zone”, because the threshold level for Danger_zone is 1400, and the user receives a message similar to the one shown in the screenshot.

If the number of cases in the state of Missouri is more than 700 but has not reached the Danger_zone threshold, then it is considered a “Moderate_zone”.
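The zone logic can be sketched in a few lines. The 700 and 1400 values are the Missouri thresholds mentioned above; the strict "greater than" comparisons and the “Safe_zone” label for counts below the moderate threshold are assumptions made for this illustration.

```python
def classify_zone(new_cases, moderate_threshold, danger_threshold):
    """Map a daily new-case count to a zone name."""
    if new_cases > danger_threshold:
        return "Danger_zone"
    if new_cases > moderate_threshold:
        return "Moderate_zone"
    # "Safe_zone" is a hypothetical label for counts below both thresholds.
    return "Safe_zone"


# Missouri example: moderate above 700, danger above 1400.
missouri_zone = classify_zone(1450, moderate_threshold=700, danger_threshold=1400)
```

The same function can be reused for the County and ZipCode by passing in their own, much smaller, thresholds.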

CloudWatch Events is used to set the schedule (cron) rule that triggers the Lambda function daily at the specified time. In our case it is set to 4:15 UTC, which is 11:15 PM Central Daylight Time on the previous evening, as shown below.
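For reference, a daily 4:15 UTC trigger is written as a six-field cron expression in CloudWatch Events. The short sketch below shows that expression and checks the local time it corresponds to, using a fixed UTC-5 offset for Central Daylight Time. The helper name local_trigger_time is hypothetical, and the date inside it is arbitrary.

```python
from datetime import datetime, timedelta, timezone

# CloudWatch Events cron fields:
# minutes hours day-of-month month day-of-week year
SCHEDULE_EXPRESSION = "cron(15 4 * * ? *)"  # every day at 04:15 UTC

# Fixed UTC-5 offset for Central Daylight Time; standard time (CST) is UTC-6.
CENTRAL_DAYLIGHT = timezone(timedelta(hours=-5), "CDT")


def local_trigger_time(utc_hour, utc_minute, tz=CENTRAL_DAYLIGHT):
    """Return the local wall-clock time (HH:MM) of a daily UTC trigger."""
    # The calendar date is arbitrary; only the time of day matters here.
    utc_time = datetime(2020, 6, 15, utc_hour, utc_minute, tzinfo=timezone.utc)
    return utc_time.astimezone(tz).strftime("%H:%M")
```

Calling local_trigger_time(4, 15) gives 23:15, that is 11:15 PM on the previous evening in Central Daylight Time. During standard time (UTC-6) the same rule would fire at 10:15 PM local time, because the schedule itself is always evaluated in UTC.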

As we know, all Amazon Echo devices are powered by “Alexa”, a cloud-based voice service. Using Alexa we can control our home appliances, check the weather forecast, find the ETA to a destination or find a nearby restaurant, and it has many more built-in features.

Apart from the built-in features, we can also create our own custom skills using the Alexa Skills Kit.

Since the daily new Covid-19 case counts for State, County and ZipCode are already in the S3 bucket, we create our own Alexa skill called “Covid notification system” to retrieve the data based on the user’s request.

Steps to implement Alexa skill:

  1. Goal of the skill: Responding to the user’s request by retrieving the data from S3 based on the request.
  2. Invocation: Every skill needs a unique name that distinguishes it from other custom skills. In our case the name of the skill is “Covid notification system”. We can invoke the skill by saying open, launch or begin followed by the skill name (“Covid notification system”) and then give our command once the skill opens, or we can name the skill and give the command in one shot. For example, we can say “Alexa, ask Covid notification system to tell the surge in Covid-19 cases in my Zip code”.
  3. Request: As shown in the example above, once the user gives the command, the request is converted into a JSON request and sent to the endpoint. Here, we developed a Lambda function as the endpoint that takes the JSON request from Alexa.
  4. Response: The Lambda function forms a similar JSON response using the data in the CSV file from the S3 bucket and sends it back to Alexa.

Alexa developer console

We add intents to the skill. Intents are like features of the skill. For example, our cases.csv file has three data columns, so we can have a separate intent for each of State, County and ZipCode, or any other intents that perform individual operations based on the user’s choice. As shown below, we have created intents like MissouriIntent, SaintLouisCountyIntent and others.

Intent section on Alexa developer console

The user can phrase a command in different ways, and the intent is invoked based on the sentence used. Some of the likely sentences are registered with Alexa in advance so that it can match them to an intent. These are called Sample Utterances. We add the various sample utterances we expect from the user.

For every intent, we assign a number of sample utterances that distinguish it and its operations from the other intents. As shown below, MissouriIntent has sample utterances defined like “give me the covid cases in {MO} for last {number} days”. Whenever the user says this or any of the other sample utterances defined below, MissouriIntent is invoked.

Missouri Intent

We can use variables in the sample utterances, shown above in curly brackets. These are called Slots. For example, in place of {MO} the user can say MO, Missouri, State, mystate or any other value we expect from the user and predefine in the slot values, as shown below. Here the slot “statename” has multiple values such as MO, missouri state, state of missouri and so on.

These slots can be used in any sample utterances for any intents in this skill.

Similarly, we can have multiple intents and sample utterances for different operations.

Slot section on Alexa developer console

Now we need to select the endpoint. We can use either AWS Lambda or a web service as the endpoint. In our case, we are using AWS Lambda. Select Lambda and add the ARN of the Lambda function.

Endpoint section on Alexa developer console

Copy the skill ID and use it in the AWS console when adding the Alexa Skills Kit trigger to the Lambda function, so that the function is only triggered by this specific skill ID, as shown below.

AWS Lambda service on AWS Console

Our custom skill is registered on the Alexa developer console. Now, we will see how Alexa uses this information and sends the user’s request to the endpoint.

Once the user gives a command to Alexa, Alexa converts the speech to text and compares the command with all the sample utterances. If one matches, it invokes the intent to which that sample utterance is attached.

Alexa converts this request to JSON and sends it to the endpoint (AWS Lambda). AWS Lambda is responsible for responding to this request in JSON format. Alexa then converts the text in the JSON response to speech and plays it to the user through the Echo device.

As shown below, when the skill is invoked the JSON input will be sent to Lambda and the JSON output will be received.

Testing section on Alexa developer console

We have three types of requests in Alexa skill development:

  1. LaunchRequest: This request is sent when the skill is invoked without a specific command, for example “open {Skill name}”. We need to respond to the launch request with a welcome message, as shown in the screenshot above.
  2. IntentRequest: After the skill is opened, any command the user gives that matches an intent arrives as an IntentRequest. Lambda should form a suitable response based on the intent name in the request and send it back to Alexa.
  3. SessionEndedRequest: This request is sent when the user ends the session, does not say anything, or an error occurs.

Lambda will call the respective functions based on the request as shown below.
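A minimal sketch of this dispatch, assuming the standard Alexa request envelope in which the request type is found at event["request"]["type"]. Apart from lambda_handler and on_intent_request, the helper names and the spoken text are placeholders, not the project's actual code.

```python
def plain_text_response(text, end_session=False):
    """Wrap a sentence in the minimal Alexa response envelope."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": text},
            "shouldEndSession": end_session,
        },
    }


def on_launch(event):
    return plain_text_response("Welcome to Covid notification system.")


def on_intent_request(event):
    intent_name = event["request"]["intent"]["name"]
    # A real skill would call the handler for this intent here.
    return plain_text_response(f"Received {intent_name}.", end_session=True)


def on_session_ended(event):
    # No speech is returned; the session is already over.
    return {"version": "1.0", "response": {}}


REQUEST_HANDLERS = {
    "LaunchRequest": on_launch,
    "IntentRequest": on_intent_request,
    "SessionEndedRequest": on_session_ended,
}


def lambda_handler(event, context):
    """Route the incoming Alexa request to the matching handler."""
    request_type = event["request"]["type"]
    handler = REQUEST_HANDLERS.get(request_type)
    if handler is None:
        raise ValueError(f"unsupported request type: {request_type}")
    return handler(event)
```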

Alexa uses a particular JSON format to send the request and to accept the response in turn. This format has to be followed while developing the Lambda code. Refer to the request and response JSON reference listed in the References.

For example, when an intent request is made for MissouriIntent, the function for that intent is called and the response is prepared accordingly. In the screenshot below, the response with the number of new Covid-19 cases in Missouri has been received.

Below is the code snippet for how the MissouriIntent is handled. Every intent has its own function written and the response is formed accordingly.

When the request type is IntentRequest, the on_intent_request() function is called. It confirms that the intent_name is MissouriIntent and then calls handle_missouri_intent() to form the response.

The response is prepared as per the intent and passed to the build_response() function, which puts it into the required format.
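As an illustration only, the two functions named above could look something like this. The signatures, the "number" slot handling and the spoken sentence are assumptions; state_counts stands for the values of the State column of cases.csv, oldest first.

```python
def build_response(speech_text, should_end_session=True):
    """Put the speech text into the JSON structure Alexa expects."""
    return {
        "version": "1.0",
        "sessionAttributes": {},
        "response": {
            "outputSpeech": {"type": "PlainText", "text": speech_text},
            "shouldEndSession": should_end_session,
        },
    }


def handle_missouri_intent(intent, state_counts):
    """Answer with the Missouri case total for the requested number of days.

    intent is the "intent" object of the Alexa request. state_counts is a
    list of daily new-case counts, oldest first (an assumed input shape).
    """
    raw_number = intent.get("slots", {}).get("number", {}).get("value")
    # Fall back to the latest day when the slot is missing or not a number.
    days = int(raw_number) if raw_number and str(raw_number).isdigit() else 1
    days = max(1, min(days, len(state_counts)))
    total = sum(state_counts[-days:])
    period = "day" if days == 1 else f"{days} days"
    speech = f"Missouri reported {total} new Covid-19 cases in the last {period}."
    return build_response(speech)
```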

Similarly, there are multiple intents written to respond based on the user’s request.

A time series is a series of data points ordered by time. Time Series Forecasting is an area of Machine Learning used to make future predictions by creating models that fit the historical data.

Prophet is an open source model to forecast the time series data. Prophet easily detects the changes in trends and seasonality.

I’ve used Prophet to forecast the number of daily new Covid-19 cases for the next 10 days.

Here is the Prophet implementation for Missouri State data:

Load the data:

The Python libraries shown in the screenshot below are used for forecasting. “cases.csv” is the file with the number of Covid-19 cases for State, County and Zip code, as shown. The data has been loaded into a DataFrame.

Prophet requires the DataFrame to have a specific format: a first column named ds, which is a copy of the date column converted to the datetime data type, and a second column named y, which is a copy of the ‘State’ column in our CSV file.

I’ve split the data into training and testing data. The training data is used to fit the Prophet model and the testing data is used to evaluate its accuracy.
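A small pandas sketch of these two steps is shown here. The function names and the size of the test set (the last 10 days by default) are assumptions for illustration; the column names State and Date come from cases.csv.

```python
import pandas as pd


def to_prophet_frame(cases, value_column="State", date_column="Date"):
    """Build the two-column ds/y DataFrame that Prophet expects."""
    frame = pd.DataFrame(
        {
            "ds": pd.to_datetime(cases[date_column]),
            "y": cases[value_column].astype(float),
        }
    )
    # Keep the rows in chronological order.
    return frame.sort_values("ds").reset_index(drop=True)


def split_train_test(frame, test_days=10):
    """Hold out the most recent test_days rows for evaluation."""
    if not 0 < test_days < len(frame):
        raise ValueError("test_days must be between 1 and len(frame) - 1")
    return frame.iloc[:-test_days], frame.iloc[-test_days:]
```

Because the data is a time series, the split is chronological rather than random: the model is fitted on the earlier days and evaluated on the most recent ones.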

Forecasting:

We already imported the fbprophet library into our Python notebook:

Instantiated the Prophet object:

Now we are ready to fit a model on our historical data. The DataFrame with the training data is passed into the fit method of the Prophet object.

Prophet has a built-in helper function, make_future_dataframe, to create a DataFrame of future dates. The make_future_dataframe function lets us specify the frequency and the number of time units we’d like to forecast into the future. By default, the frequency is set to days. Since I wanted the forecast for 10 days into the future, I’ve set the periods argument to 10.

The predict method then makes a prediction for each row in the future DataFrame.

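from fbprophet import Prophet
m = Prophet()
m.fit(train_data)  # train_data needs the columns ds and y
future = m.make_future_dataframe(periods=10)  # training dates plus 10 future days
prophet_pred = m.predict(future)  # one forecast row per date in future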

The new DataFrame assigned to the prophet_pred variable contains the forecasted values for the future dates under the column yhat, as well as the lower and upper intervals and other components of the forecast.

The values under the yhat column are the forecasted daily new Covid-19 cases in Missouri for 10 days into the future.

The forecast can be visualized using Prophet’s built-in plot helper function, and the individual forecast components can be visualized using Prophet’s built-in plot_components function.

Now, only the predicted values are added to a DataFrame to plot the results and compare them with the testing data.

Comparing Testing data and predictions:

The plot shows the Prophet predictions as the orange line and the actual daily new Covid-19 cases from the testing data as the blue line.

Forecast Error:

A forecast error is the difference between the actual value and its forecast. Here, error does not mean a mistake; it means the unpredictable part of an observation.

forecast_error = expected_value - predicted_value

Here, the mean absolute error, mean squared error and root mean squared error are calculated to measure the forecast performance.

The mean absolute error, or MAE, is the average of the absolute values of the forecast errors.

mean_absolute_error = mean( abs(forecast_error) )

The mean squared error, or MSE, is the average of the squared forecast error values.

mean_squared_error = mean(forecast_error²)

Taking the square root of the MSE gives us the root mean squared error, or RMSE.

rmse = sqrt(mean_squared_error)
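To make the three formulas concrete, here is a plain-Python sketch of them. It is written without scikit-learn purely for illustration, and the function names are my own.

```python
import math


def forecast_errors(expected, predicted):
    """Return expected - predicted for each pair of values."""
    if len(expected) != len(predicted) or not expected:
        raise ValueError("inputs must be non-empty and of equal length")
    return [e - p for e, p in zip(expected, predicted)]


def mean_absolute_error(expected, predicted):
    errors = forecast_errors(expected, predicted)
    return sum(abs(err) for err in errors) / len(errors)


def mean_squared_error(expected, predicted):
    errors = forecast_errors(expected, predicted)
    return sum(err ** 2 for err in errors) / len(errors)


def root_mean_squared_error(expected, predicted):
    return math.sqrt(mean_squared_error(expected, predicted))
```

For instance, with expected values [3, 5] and predictions [1, 9], the forecast errors are 2 and -4, so the MAE is 3.0, the MSE is 10.0 and the RMSE is the square root of 10. Squaring penalizes the larger miss more heavily, which is why the RMSE comes out above the MAE.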

All of the above measures are calculated using the scikit-learn library, and the calculated errors are shown in the screenshot above.

Similarly, I’ve performed Time Series Forecasting on the historical data of daily new Covid-19 cases at the County level and ZipCode level, collected over the past 3 months.

Conclusion:

I strongly believe in learning things by doing. Every part of this project was fun and challenging, right from collecting the daily updated Covid-19 data to Forecasting the future Covid-19 cases. It was a great experience to learn things in depth like Lambda layers, “/tmp” path in Lambda and exploring new areas like Alexa Skill Development and Time Series Forecasting.

References:

https://developer.amazon.com/en-US/docs/alexa/custom-skills/request-and-response-json-reference.html

https://mode.com/example-gallery/forecasting_prophet_python_cookbook/

https://machinelearningmastery.com/time-series-forecasting-performance-measures-with-python/

Cloud Engineer | AWS Cloud, DevOps and Data Enthusiast!