November 17, 2015

Integrating Selenium into your Python Automation Project

1. Selenium 101

What is Selenium?

Selenium is a browser automation tool: it drives a web browser programmatically.

You can work with Selenium mainly through two entry points:

  1. Selenium IDE (Integrated Development Environment) is a prototyping tool for building test scripts. It is a Firefox plugin and provides an easy-to-use interface for developing automated tests. Selenium IDE has a recording feature, which records user actions as they are performed and then exports them as a reusable script in one of many programming languages that can later be executed.
  2. Selenium WebDriver makes direct calls to the browser using each browser's native support for automation. The browser chosen determines how these direct calls are made and which features are supported. You can use the Selenium WebDriver API implementation for any specific browser and integrate it into your automation project, whatever its programming language and structure.

1.a. The Selenium IDE

The Selenium IDE is available only for Firefox. It is an easy-to-use Firefox plugin and is generally the most efficient way to develop test case scenarios. Its intended purpose is mainly easy automation for people who don't use or don't want to use programming languages.

However, it doesn't give you access to all of the power of the Selenium implementation. If you want all of the power that's under the hood, you should use Selenium WebDriver in your favorite programming language.


  1. Using Firefox, first download the IDE from the SeleniumHQ downloads page
  2. Select "Install Now"
  3. Restart Firefox
  4. After the restart, you will be presented with the following menu option:

The IDE main window looks like this:

IDE Features:

  1. Creating, editing, saving new/existing test case scenarios
  2. Recording scenarios
  3. Executing scenarios and controlling execution speed
  4. "Debugging" scenarios

The Toolbar:

Record: record manually performed user actions that occur inside the browser

Execution speed control: adjust how fast the scenario runs

Execution control:

Run all scenarios: run every scenario

Run current scenario: run the currently selected scenario

Step: run one command at a time

Pause/Resume: pause or resume execution

1.b. The Selenium WebDriver

The Selenium WebDriver makes direct calls to the browser using each browser's native support for automation. There is a specific driver available for each of the supported browsers, mainly Firefox, Internet Explorer, and Chrome, though it also includes Opera.

It provides an API that can be used for easy integration into your automation project. Its main intended purpose is automation and it gives you complete power over all of the capabilities of the Selenium WebDriver implementation.

2. Integrating the Python Selenium Bindings Into Your Project

The Python bindings for Selenium provide easy access to all of the functionality offered by Selenium WebDriver from Python.

The currently supported Python versions are 2.7, 3.2, 3.3, and 3.4.

The easiest way to get the Selenium Python binding is through pip: C:\Python27\Scripts\pip install selenium

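Now you can use WebDriver to create automated test cases in Python. Just create a .py file on your drive and import webdriver from the selenium package (from selenium import webdriver).

After this, you can access the different classes in the API, such as webdriver.Firefox, webdriver.Chrome, and webdriver.Ie.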


The Firefox WebDriver is available by default. If you want to use another browser, like Chrome, you need to download the appropriate driver from the driver maintainer (in this case, the Chromium project, which is available here) and put it in a known location. For example, add the directory that contains ChromeDriver to your PATH environment variable.

If you don't want to include the driver path in your PATH, you need to specify the path to the driver when instantiating the driver:

driver = webdriver.Chrome('/path/to/chromedriver')  # Optional argument, if not specified will search path.
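The two options above (PATH lookup versus an explicit path) can be combined in a small helper. This is a minimal sketch using only the standard library and assuming Python 3.3 or newer for shutil.which; the function name resolve_driver_path and its fallback argument are hypothetical and not part of Selenium.

```python
import shutil


def resolve_driver_path(executable="chromedriver", fallback=None):
    """Return the driver's location from PATH, or `fallback` if it is not on PATH.

    `resolve_driver_path` and `fallback` are illustrative names, not Selenium API.
    """
    found = shutil.which(executable)  # searches the directories listed in PATH
    if found is not None:
        return found
    if fallback is not None:
        return fallback
    raise RuntimeError(
        "%s is not on PATH and no fallback path was given" % executable
    )
```

The returned string can then be passed as the first argument to webdriver.Chrome, as in the line above.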

Example using Firefox

Let's check that a Google search works when you type something in the search box and press ENTER. I know this seems like a silly example, but let's use it to keep things simple.

We will create two Python files:

  • One will contain a helper class that wraps Selenium up for us in nice methods
  • The other will inherit from Python's unittest framework and run a test case

# selenium_helper.py: helper class that wraps Selenium in a few convenient methods
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import Select
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By

import time

class SeleniumHelper():
    def initializeWebdriver(self, driverType, implicitWait=10):
        # initialize the appropriate driver; remember to download the driver and add it to your PATH environment variable
        if driverType == 'Firefox':
            self.driver = webdriver.Firefox()
        elif driverType == 'Chrome':
            self.driver = webdriver.Chrome()
        elif driverType == 'IE':
            self.driver = webdriver.Ie()
        else:
            raise Exception('Unknown webdriver type')
        # for every action do a retry with a 10 seconds timeout by default or user specified
        self.driver.implicitly_wait(implicitWait)

    def navigateToUrl(self, url):
        try:
            self.driver.get(url)
        except Exception:
            # signal something was wrong, or handle the exception appropriately here according to your needs
            raise

    def closeBrowser(self):
        try:
            self.driver.quit()
        except Exception:
            # signal something was wrong, or handle the exception appropriately here according to your needs
            raise

    def search(self, searchString):
        try:
            # find search field in webpage using XPATH
            searchField = self.driver.find_elements_by_xpath('//input[@type="text" and @name="q"]')[0]
            # check if the search field is accessible to the user
            if not searchField.is_displayed():
                raise Exception("Search field not found.")
            # clear contents of search field
            searchField.clear()
            # type in query followed by the ENTER key to trigger the search
            searchField.send_keys(searchString + Keys.RETURN)
        except Exception:
            # signal something was wrong, or handle the exception appropriately here according to your needs
            raise

    def waitForPageTitle(self, expectedTitle, timeout=30):
        # poll the page title once per second until it matches or the timeout runs out
        while timeout > 0:
            time.sleep(1)
            timeout = timeout - 1
            if expectedTitle in self.driver.title:
                break
        if timeout <= 0 and (expectedTitle not in self.driver.title):
            return False
        return True


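# test case built on the helper class
import unittest
from selenium_helper import SeleniumHelper

# assumed address of the Google home page; adjust it to your needs
GOOGLE_URL = "https://www.google.com"

class TestGoogleSearch(unittest.TestCase):

    def test_search_field(self):
        SeleniumHelperInstance.navigateToUrl(GOOGLE_URL)
        SeleniumHelperInstance.search("particle detector")
        if not SeleniumHelperInstance.waitForPageTitle("particle detector"):
            # first close the browser
            SeleniumHelperInstance.closeBrowser()
            self.fail("Search didn't work.")
        SeleniumHelperInstance.closeBrowser()

if __name__ == '__main__':
    SeleniumHelperInstance = SeleniumHelper()
    SeleniumHelperInstance.initializeWebdriver('Firefox')
    unittest.main()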

Now let's install the Selenium Python bindings and run the test. Assuming you have Python 2.7 installed, the following command should be issued: C:\Python27\Scripts\pip install selenium

We placed the above files on C:/selenium_sample on our machine. Now we can issue the following:

As you will see on your machine, Firefox will start up and SeleniumHelper will execute the navigation, input the text into the search field, and wait for the page title to change. All of that is implemented in the selenium_helper.py file.
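The "wait for the page title to change" step is a plain polling loop, and it can be tried out without starting a browser. The sketch below is an illustration under stated assumptions: wait_until and FakeDriver are hypothetical names, and FakeDriver merely imitates a driver whose title changes after a couple of reads. It assumes Python 3.3 or newer for time.monotonic. Selenium's own WebDriverWait together with expected_conditions.title_contains, both imported by the helper, serve the same purpose against a real browser.

```python
import time


def wait_until(condition, timeout=30.0, interval=1.0):
    """Call `condition` repeatedly until it returns True or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


class FakeDriver(object):
    """Hypothetical stand-in for a WebDriver: the title changes on the third read."""

    def __init__(self):
        self.reads = 0

    @property
    def title(self):
        self.reads += 1
        return "particle detector - Search" if self.reads >= 3 else "Google"


driver = FakeDriver()
found = wait_until(lambda: "particle detector" in driver.title,
                   timeout=2.0, interval=0.01)
print(found, driver.reads)
```

Checking the condition before sleeping means a page that is already loaded returns immediately, and using a deadline instead of a countdown keeps the timeout accurate even when a single check is slow.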

Let's run it for Chrome. Open the test file, replace the "Firefox" string with "Chrome", and then run it again.

Oops! We need to get the driver for Chrome. Well, Selenium is nice and gives us a link to it. We will download the Windows build, but you can download the one suitable for your platform. Unpacking the archive leaves us with a file called chromedriver.exe. If you want to be quick, just copy it next to our Python file and run the test again. Voila! Chrome opens and runs the test case. It's as easy as that.

Updating is also easy--just overwrite the file with the new version. Want everyone to work with the same version? Push it into your repository and let the other people pull it to their machines or the test machine.

3. XPaths

The element selection from the UI was done using XPath location. XPath uses path notation, like in URLs, to navigate inside the structure of the webpage. An element can be located by using either an absolute path or a relative path. The absolute path has to include all of the hierarchy above the selected element, which makes it unstable when the webpage changes, as was the case with our "baby" web interface. So, naturally, a relative path was chosen. In order to select the needed element, a combination of XPath operators and attributes has to be used.
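The instability of full paths is easy to demonstrate outside a browser. The following is a minimal sketch, assuming two hypothetical versions of a tiny page and using Python's xml.etree.ElementTree, which implements only a subset of XPath and evaluates paths relative to the root element (so the "absolute" path is written starting with "./").

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified page used only for this illustration.
PAGE_V1 = """
<html><body>
  <div id="content">
    <button class="play">Play</button>
  </div>
</body></html>
"""

# The same page after a redesign wrapped the content in one more <div>.
PAGE_V2 = """
<html><body>
  <div id="wrapper">
    <div id="content">
      <button class="play">Play</button>
    </div>
  </div>
</body></html>
"""

FULL_PATH = "./body/div/button"             # spells out every ancestor
RELATIVE_PATH = ".//button[@class='play']"  # matches at any depth

results = {}
for name, page in (("v1", PAGE_V1), ("v2", PAGE_V2)):
    root = ET.fromstring(page)
    results[name] = (
        root.find(FULL_PATH) is not None,      # does the full path still match?
        root.find(RELATIVE_PATH) is not None,  # does the relative path match?
    )

print(results)  # {'v1': (True, True), 'v2': (False, True)}
```

The full path stops matching as soon as one wrapper element is added, while the relative path keeps working in both versions.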

Getting the XPath to use in Selenium selection:

In order to find the XPath, we used the Firebug extension for Firefox (you can install the latest version from here). After installation, you will see it appear in the Firefox toolbar:

Once you click the Firebug icon, the mighty interface will appear:

Now let's see how you can get the XPath for an element. Press the "Element Inspect" button (indicated by the red arrow in the image above) and select one of the elements in the page. In our example, we want to select the "Play" button on an embedded YouTube video, so we just click on it. When you select it, Firebug will instantly highlight the HTML code of the element, as seen in this image:

You can see the blue border around the selected element and corresponding HTML code.

This is the selected code:

<button class="ytp-large-play-button ytp-button" tabindex="23" aria-live="assertive" style="" aria-label="Watch Introducing Invincea Advanced Endpoint Protection"><svg width="100%" viewBox="0 0 68 48" version="1.1" height="100%"><path fill-opacity="0.9" fill="#1f1f1e" d="m .66,37.62 c 0,0 .66,4.70 2.70,6.77 2.58,2.71 5.98,2.63 7.49,2.91 5.43,.52 23.10,.68 23.12,.68 .00,-1.3e-5 14.29,-0.02 23.81,-0.71 1.32,-0.15 4.22,-0.17 6.81,-2.89 2.03,-2.07 2.70,-6.77 2.70,-6.77 0,0 .67,-5.52 .67,-11.04 l 0,-5.17 c 0,-5.52 -0.67,-11.04 -0.67,-11.04 0,0 -0.66,-4.70 -2.70,-6.77 C 62.03,.86 59.13,.84 57.80,.69 48.28,0 34.00,0 34.00,0 33.97,0 19.69,0 10.18,.69 8.85,.84 5.95,.86 3.36,3.58 1.32,5.65 .66,10.35 .66,10.35 c 0,0 -0.55,4.50 -0.66,9.45 l 0,8.36 c .10,4.94 .66,9.45 .66,9.45 z" class="ytp-large-play-button-bg"/><path fill="#fff" d="m 26.96,13.67 18.37,9.62 -18.37,9.55 -0.00,-19.17 z"/><path fill="#ccc" d="M 45.02,23.46 45.32,23.28 26.96,13.67 43.32,24.34 45.02,23.46 z"/></svg></button>

The opening <button> tag with its attributes is the part we're interested in. Let's see the XPaths we can use:

  1. driver.find_element_by_xpath("//button[@class='ytp-large-play-button ytp-button']")
    This is the simplest way to select the button, but you have to be careful that no other element in the page has the same @class. If there are other elements with the same class, the first one in document order will be returned. You might get lucky and the first one could be the one you're looking for, meaning that when you run the test it's going to work. However, as soon as another element with the same class is introduced before this one, the script will no longer return the intended one. So let's make it a little more specific.
  2. driver.find_element_by_xpath("//button[@class='ytp-large-play-button ytp-button' and @aria-label='Watch Introducing Invincea Advanced Endpoint Protection']")
    Notice the added @aria-label? We have one extra attribute with which we make even more sure that the returned element is the button we expected.
  3. driver.find_element_by_xpath("/html/body/div/div[2]/div[3]/article/div/div[3]/div[2]/div/div[5]/iframe/html/body/div/div/div/button")
    See how easy it is to use an absolute path? Remember when we said at the beginning that we didn't use this because it's prone to instability? Well, it's also very long! And imagine adding one element in the page somewhere before this button. You'll have to rewrite the whole XPath. This path also crosses an iframe, so Selenium would first have to switch into that frame (driver.switch_to.frame) before it could evaluate the part after it.

Notice how the first two examples begin the XPath with "//" while the third one begins with "/"? The difference between them is that "/" starts the selection from the root node, while "//" selects matching nodes anywhere in the document, no matter where they are in the hierarchy. You can also use "." to select the current node and ".." to select the parent of the current node. Play around with the attributes in the HTML code by putting the @ prefix in front of them (e.g. @type, @data, @class, or @id).
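These operators can be tried out without a browser. The sketch below assumes a hypothetical, reduced version of the markup and uses Python's xml.etree.ElementTree, which supports only a subset of XPath: it has no "and" operator, so the second attribute is added as a chained predicate instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup, reduced to the attributes discussed above.
PAGE = """
<html><body>
  <div id="toolbar">
    <button class="ytp-button" aria-label="Mute">Mute</button>
  </div>
  <div id="player">
    <button class="ytp-button" aria-label="Watch intro">Play</button>
  </div>
</body></html>
"""
root = ET.fromstring(PAGE)

# ".//" searches all descendants; matches come back in document order.
by_class = root.findall(".//button[@class='ytp-button']")
labels = [button.get("aria-label") for button in by_class]

# A second attribute predicate narrows the match to a single button.
play = root.find(".//button[@class='ytp-button'][@aria-label='Watch intro']")

# ".." steps from the matched button up to its parent element.
parent = root.find(".//button[@aria-label='Watch intro']/..")

print(labels)            # ['Mute', 'Watch intro']
print(play.text)         # Play
print(parent.get("id"))  # player
```

Selecting by class alone returns the "Mute" button first, which is exactly the trap described for the first XPath above; the extra attribute removes the ambiguity.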

And there are functions, too! XPath has built-in functions for numeric, string, and boolean operations, among other things (the exact set depends on the XPath version your browser implements). For example, you can select the "Save" button in a webpage by using driver.find_element_by_xpath("//button[contains(text(),\"Save\")]")

How about not only selecting the button but also clicking it? Selenium allows us to do this by adding .click() to the end of the element selection code. Use the singular find_element_by_xpath here, because find_elements_by_xpath returns a list, and a list has no click method. So the above example will become driver.find_element_by_xpath("//button[contains(text(),\"Save\")]").click() Beautiful, no?
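When clicking, it matters whether the lookup returns one element or a list of elements. The sketch below shows the difference with stand-in objects so that it runs without a browser: FakeDriver and FakeButton are hypothetical and only mimic the method names used in this article. Newer Selenium releases spell the lookup as find_element(By.XPATH, ...) instead.

```python
class FakeButton(object):
    """Hypothetical stand-in for a WebElement that only counts clicks."""

    def __init__(self):
        self.clicks = 0

    def click(self):
        self.clicks += 1


class FakeDriver(object):
    """Hypothetical stand-in for a WebDriver exposing both lookup styles."""

    def __init__(self):
        self.button = FakeButton()

    def find_element_by_xpath(self, xpath):
        return self.button      # singular: a single element

    def find_elements_by_xpath(self, xpath):
        return [self.button]    # plural: a list of elements


SAVE_XPATH = '//button[contains(text(),"Save")]'
driver = FakeDriver()

# Singular lookup: call click() directly on the returned element.
driver.find_element_by_xpath(SAVE_XPATH).click()

# Plural lookup: iterate, because the list itself cannot be clicked.
for button in driver.find_elements_by_xpath(SAVE_XPATH):
    button.click()

list_has_click = hasattr(driver.find_elements_by_xpath(SAVE_XPATH), "click")
print(driver.button.clicks, list_has_click)
```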

There are a few more ways to use XPath, which are easy to figure out on your own.

4. Remarks

We chose Python and Selenium after some time dedicated to research. Python was the obvious language choice because it was already used to automate the tests of a sister Windows application. The question was which test tool to use for interacting with the web interface. The choice came down to Selenium and Sikuli. Selenium won because the web interface we're using is still in its early stages, meaning it's subject to many UI changes, against which Selenium is more resilient.

Thanks for your time and I hope all of this will be of help!