Recently, while automating a functional test case, I ran into a requirement: for every transaction, I had to verify the content of an email sent to users. Coming from a UI automation background, I initially thought of automating the Gmail web UI using Selenium WebDriver.
However, after more searching, I found that Gmail’s UI is developed in Google Web Toolkit (GWT), which is highly dynamic and assigns randomly generated IDs to every DOM element in each session. After discovering this, I gave up on this approach, because locators built on IDs that change from session to session would be very difficult to maintain in our WebDriver code. I then began looking into alternative approaches.
After more research, I found out that in 2014 Google released its official Gmail API, which provides a RESTful interface for both reading Gmail mailboxes and sending emails. I decided to use this approach because it is much more robust than UI automation and gives users more control over their Gmail inbox.
This implementation required me to use Ruby, which has dedicated client bindings for these APIs. To get started with the Ruby bindings, we need to install the official gem developed by the Google team.
Google uses OAuth 2.0 to authenticate calls to these APIs, so first we need to understand how to set up OAuth credentials. I have outlined detailed steps for setting up OAuth in the “Readme” file of my GitHub repository. After setting up the OAuth credentials, we need to download the client secret JSON file and reference it in our Ruby code.
The way Google has designed the authentication process for the Gmail API is tricky. It requires a one-time manual step in which a user logs in through the browser and grants access. The resulting access token is then saved to a file and reused in later runs.
Here is the sample code for authentication:
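As a minimal sketch of the two pieces described above (the one-time browser consent and the saved token), the following uses only Ruby's standard library. The helper names `authorization_url` and `cached_token`, the file paths, and the assumption that the client secret file was downloaded for a desktop ("installed") client are mine, not part of any library. In practice the googleauth gem wraps both steps for you, so treat this as an illustration of the flow rather than the code from my repository.

```ruby
require 'json'
require 'uri'

# Full-mailbox scope; Gmail requires it for permanent deletion.
GMAIL_SCOPE = 'https://mail.google.com/'

# Builds the consent URL that the user opens once in a browser.
# Assumes the client secret JSON has a top-level "installed" key,
# as it does for desktop-type OAuth clients.
def authorization_url(client_secret_path, redirect_uri:, scope: GMAIL_SCOPE)
  config = JSON.parse(File.read(client_secret_path)).fetch('installed')
  query = URI.encode_www_form(
    client_id: config.fetch('client_id'),
    redirect_uri: redirect_uri,
    response_type: 'code',
    scope: scope,
    access_type: 'offline' # request a refresh token so later runs need no browser
  )
  "#{config.fetch('auth_uri')}?#{query}"
end

# Returns the token saved by an earlier run if the file exists. Otherwise
# yields so the caller can run the manual flow, then saves what the block returns.
def cached_token(token_path)
  return JSON.parse(File.read(token_path)) if File.exist?(token_path)

  token = yield
  File.write(token_path, JSON.generate(token))
  token
end
```

The block passed to `cached_token` is where the manual step would live: print the URL, wait for the user to paste the authorization code, and exchange it for a token. On every later run the block is skipped because the token file already exists.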
Once authentication is in place, we need to explore the different Gmail APIs in detail to understand how to perform different operations with them. To do this, we can use one of Google’s API explorers, which let us call the APIs from the browser and inspect the responses.
In my implementation, I developed code, which can be seen here, for searching, reading, and deleting emails based on a search query.
Here is the sample code for reading emails:
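A minimal sketch of query-based reading, assuming a `service` object that behaves like `Google::Apis::GmailV1::GmailService` from the google-api-client gem. The helper name `read_emails` and the shape of the returned hashes are my own choices for illustration, and only the first page of search results is handled.

```ruby
# Returns the id, subject, and snippet of each message matching a Gmail
# search query. `service` is expected to respond to list_user_messages
# and get_user_message like Google::Apis::GmailV1::GmailService does.
def read_emails(service, query, user_id: 'me')
  listing = service.list_user_messages(user_id, q: query)
  # `messages` is nil, not an empty array, when nothing matches.
  (listing.messages || []).map do |ref|
    message = service.get_user_message(user_id, ref.id)
    subject = message.payload.headers.find { |h| h.name.casecmp?('Subject') }
    { id: message.id, subject: subject&.value, snippet: message.snippet }
  end
end

# Usage with the real client (assumes the gem is installed and that
# `credentials` came out of the authentication step):
#   require 'google/apis/gmail_v1'
#   service = Google::Apis::GmailV1::GmailService.new
#   service.authorization = credentials
#   read_emails(service, 'from:noreply@example.com subject:"Order confirmed"')
```

The query string uses the same syntax as the Gmail search box, so a test can narrow the results to the one transaction it cares about before asserting on the content.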
Here is the sample code for deleting emails:
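A minimal sketch of query-based deletion, again assuming a `service` that behaves like `Google::Apis::GmailV1::GmailService`. The helper name `delete_emails` and its `permanent:` switch are hypothetical. Moving messages to Trash is the default here because a permanent delete cannot be undone.

```ruby
# Removes every message on the first results page of a Gmail search query
# and returns the ids that were processed.
def delete_emails(service, query, user_id: 'me', permanent: false)
  listing = service.list_user_messages(user_id, q: query)
  ids = (listing.messages || []).map(&:id) # nil when nothing matches
  ids.each do |id|
    if permanent
      # Irreversible; needs the full https://mail.google.com/ scope.
      service.delete_user_message(user_id, id)
    else
      # Moves the message to Trash, from where it can still be restored.
      service.trash_user_message(user_id, id)
    end
  end
  ids
end
```

In a test suite this fits naturally into a setup or teardown step, so that each run starts from a mailbox with no leftover messages matching the query.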
These APIs are very powerful and allow many advanced operations, such as working with attachments, reading multi-part emails, and working with labels and drafts, among others.
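To illustrate the multi-part case, here is a small sketch that walks a message payload and returns the first `text/plain` body it finds. It assumes parts shaped like the Ruby client's message parts (responding to `mime_type`, `body`, and `parts`); the helper name `plain_text_body` is hypothetical.

```ruby
# Depth-first search through a (possibly nested) message part tree for
# the first text/plain part. Returns its body data, or nil if none exists.
def plain_text_body(part)
  return part.body&.data if part.mime_type == 'text/plain'

  # `parts` is nil for leaf parts, so fall back to an empty list.
  (part.parts || []).each do |child|
    text = plain_text_body(child)
    return text if text
  end
  nil
end
```

With the reading code above, the starting point would be the `payload` of a fetched message, for example `plain_text_body(message.payload)`.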
However, if we only want to read emails, there is a simpler approach based on IMAP (with SMTP for sending) that does not require the Google APIs. Here is a very good implementation of this approach in Ruby based on IMAP. This implementation can also be used for what I have outlined above.
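For comparison, a minimal sketch of the IMAP route using Ruby's `net/imap` library. This is not the implementation referred to above; the helper name `imap_subjects` is hypothetical, and the connection details in the usage comment assume that IMAP access is enabled for the account and that an app password (or similar credential) is available.

```ruby
require 'net/imap'

# Returns the subject of every message matching the IMAP search keys.
# `imap` is expected to be a logged-in Net::IMAP connection.
def imap_subjects(imap, search_keys, mailbox: 'INBOX')
  imap.examine(mailbox) # read-only select, so flags are left untouched
  imap.search(search_keys).map do |seq|
    envelope = imap.fetch(seq, 'ENVELOPE').first.attr['ENVELOPE']
    envelope.subject # may still be MIME encoded for non-ASCII subjects
  end
end

# Usage (address and app_password are placeholders):
#   imap = Net::IMAP.new('imap.gmail.com', port: 993, ssl: true)
#   imap.login(address, app_password)
#   imap_subjects(imap, ['FROM', 'noreply@example.com', 'UNSEEN'])
#   imap.logout
#   imap.disconnect
```

The trade-off is that IMAP search keys are less expressive than Gmail's query syntax, which is one reason the API route is worth the extra OAuth setup when more than simple reads are needed.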