Understanding Selenium and WebDriver: A Complete Introduction

What is Selenium?

Selenium is an open-source automation testing tool used to automate web applications. It is free to download from the official Selenium website, making it a cost-effective alternative to tools like QTP (QuickTest Professional), which require expensive licenses.

Why Selenium?

  • Cost-Effective: Selenium is free, which makes it highly popular in the automation testing industry.

  • Web-Based Applications: Selenium is designed exclusively for web-based applications. Any application that renders in a browser can be automated using Selenium.

  • Cross-Browser Compatibility: It supports multiple browsers like Chrome, Firefox, Edge, and Safari (plus legacy Internet Explorer). The same test code can run across different browsers by simply specifying the browser name.

  • Cross-Platform Support: Selenium works on Windows, macOS, and Linux, making it a versatile tool for different environments.

  • Multi-Language Support: Test cases can be written in the officially supported languages (Java, Python, C#, JavaScript, and Ruby), and community-maintained bindings exist for others such as PHP, allowing testers to choose their preferred language.
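
As a rough illustration of the cross-browser point above, here is a minimal sketch using Selenium's Python bindings (the choice of Python is an assumption; the same idea applies in any supported language). The helper names resolve_browser and create_driver and the BROWSER_CLASSES table are hypothetical, introduced only for this example.

```python
# Maps a lowercase browser name to the driver class name exposed by
# selenium.webdriver. This table and the helpers below are illustrative only.
BROWSER_CLASSES = {
    "chrome": "Chrome",
    "firefox": "Firefox",
    "edge": "Edge",
    "safari": "Safari",
}


def resolve_browser(name):
    """Return the selenium.webdriver class name for a browser name."""
    key = name.strip().lower()
    if key not in BROWSER_CLASSES:
        raise ValueError(f"Unsupported browser: {name!r}")
    return BROWSER_CLASSES[key]


def create_driver(name):
    """Start a local browser session for the given browser name."""
    class_name = resolve_browser(name)
    # Imported here so the name lookup above works without Selenium installed.
    from selenium import webdriver
    return getattr(webdriver, class_name)()
```

With a helper like this, switching from create_driver("chrome") to create_driver("firefox") would be the only change needed to target another browser; the test steps that follow stay the same.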

Key Components of Selenium

  • Selenium WebDriver: A robust tool for browser-based automation. It is the most widely used tool in the Selenium family and allows users to write customized scripts for regression testing.

  • Selenium IDE: A record-and-playback tool designed for beginners or users without coding knowledge. However, it is rarely used for large real-world projects due to its limited functionality.

  • Selenium Grid: A tool that enables parallel execution of tests across multiple environments. It uses a Hub-Node architecture, where the Hub is the central server controlling test execution and the Nodes are the machines executing the tests.
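
To show how a script talks to a Grid rather than a local browser, here is a minimal sketch in Python. It assumes a Hub is already running and reachable; the helper names grid_url and create_remote_driver are hypothetical, and the host and port are placeholders (4444 is the Grid's default port).

```python
def grid_url(host="localhost", port=4444):
    """Build the Hub URL the script connects to."""
    return f"http://{host}:{port}"


def create_remote_driver(hub_url):
    """Ask the Hub for a Chrome session; the Hub routes it to a matching Node."""
    # Imported here so grid_url() can be used without Selenium installed.
    from selenium import webdriver
    options = webdriver.ChromeOptions()
    return webdriver.Remote(command_executor=hub_url, options=options)
```

The test steps written against the returned driver are the same as for a local session; only the way the driver is created differs.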

Before Selenium, tools like QTP dominated the market, but they were costly. Selenium’s free and open-source nature allowed it to gain traction quickly, becoming the go-to tool for automation in the QA industry. Its flexibility, combined with the ability to automate across browsers and platforms, makes it a favorite among testers.

Why Choose Selenium WebDriver?

  • It is the core and most capable tool in the Selenium suite, having superseded the older Selenium RC.

  • It supports modern browsers and simplifies automation by letting a single script run across platforms and browsers.

  • It is used for building scalable and maintainable automation frameworks in enterprise-level projects that require complex automation.

Selenium WebDriver Architecture

Selenium WebDriver architecture defines how Selenium interacts with web browsers to automate tasks. It uses the W3C WebDriver protocol for standardized communication between your Selenium scripts, browser drivers, and web browsers.

Workflow:

Step 1: Writing the Test Case

  • Write your Selenium script in a code editor like Eclipse or VS Code, using a supported programming language (e.g., Java, Python).
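
A first test case can be very small. The sketch below uses Python and keeps the check itself separate from the browser setup; the function names page_title_matches and run_with_chrome are hypothetical, and running run_with_chrome assumes Selenium and Chrome are installed.

```python
def page_title_matches(driver, url, expected_title):
    """Open url and report whether the page title equals expected_title."""
    driver.get(url)
    return driver.title == expected_title


def run_with_chrome(url, expected_title):
    """Run the check in a real Chrome session."""
    # Imported here so the check above can be exercised without Selenium.
    from selenium import webdriver
    driver = webdriver.Chrome()
    try:
        return page_title_matches(driver, url, expected_title)
    finally:
        driver.quit()  # always close the browser, even if the check fails
```

Keeping the check in its own function means any WebDriver instance (Chrome, Firefox, or a remote session) can be passed in.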

Step 2: Code Conversion to JSON Format

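  • When the script runs, the Selenium client library converts each WebDriver command (for example, opening a URL or clicking an element) into an HTTP request with a JSON body, as defined by the W3C WebDriver protocol. This standardized format enables communication with the browser driver.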

Step 3: Communication with the Browser Driver

  • The JSON request is sent to the browser driver (e.g., ChromeDriver, GeckoDriver, EdgeDriver) via HTTP.

  • The browser driver interprets the commands and acts as a bridge between the Selenium script and the browser.
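
To make Steps 2 and 3 more concrete, the sketch below builds the kind of requests the client library sends for a few common commands. The endpoints and body shapes follow the W3C WebDriver specification, but the function names are hypothetical and the real client library does this work internally; nothing here actually contacts a driver.

```python
import json


def new_session_request(browser_name):
    """Request that starts a browser session."""
    body = {"capabilities": {"alwaysMatch": {"browserName": browser_name}}}
    return ("POST", "/session", body)


def navigate_request(session_id, url):
    """Request that opens a URL in an existing session."""
    return ("POST", f"/session/{session_id}/url", {"url": url})


def find_element_request(session_id, css_selector):
    """Request that locates an element by CSS selector."""
    body = {"using": "css selector", "value": css_selector}
    return ("POST", f"/session/{session_id}/element", body)


def click_request(session_id, element_id):
    """Request that clicks a previously located element."""
    return ("POST", f"/session/{session_id}/element/{element_id}/click", {})


def to_wire(request):
    """Render a request as 'METHOD path' plus its JSON body, for inspection."""
    method, path, body = request
    return f"{method} {path}\n{json.dumps(body)}"
```

Printing to_wire(navigate_request(session_id, url)) for a session shows the method, path, and JSON body that would travel over HTTP to the browser driver.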

Step 4: Execution on the Browser

  • The browser driver performs the actions on the web browser (e.g., clicking a button or verifying text).

Step 5: Returning the Response

  • The browser reports the outcome of each action back to the browser driver.

  • The browser driver forwards this outcome to the test script as an HTTP response with a JSON body.
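
The response side can be sketched in the same way. In the W3C WebDriver protocol, a response body is a JSON object with a "value" field; on failure, that value carries "error" and "message" fields. The parser below is a simplified illustration, and the names parse_response, element_id, and WebDriverCommandError are hypothetical.

```python
import json

# Key the W3C specification uses to mark a web element reference.
ELEMENT_KEY = "element-6066-11e4-a52e-4f735466cecf"


class WebDriverCommandError(Exception):
    """Hypothetical exception type for a command the driver rejected."""


def parse_response(body_text):
    """Return the "value" payload, or raise if the driver reported an error."""
    value = json.loads(body_text)["value"]
    if isinstance(value, dict) and "error" in value:
        message = value.get("message", "")
        raise WebDriverCommandError(f"{value['error']}: {message}")
    return value


def element_id(value):
    """Extract the element reference returned by a find-element command."""
    return value[ELEMENT_KEY]
```

The real client library performs an equivalent step and turns error responses into language-specific exceptions, which is how a failed step surfaces in the test script.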

Step 6: Displaying Results

  • The test results are displayed in your IDE or test runner output, indicating success or failure.

Key Roles in the Architecture

  1. Client (Selenium Script): Contains test steps written in a programming language.

  2. Browser Driver (Server): Acts as a mediator between the script and the browser.

    • Example: ChromeDriver, GeckoDriver, EdgeDriver.

  3. Browser: The actual browser (e.g., Chrome, Firefox, Edge) that loads and renders the web application being tested.

Importance of Browser Drivers

  • Every browser requires a specific driver:

    • Chrome: ChromeDriver

    • Firefox: GeckoDriver

    • Edge: EdgeDriver

  • Selenium Manager (introduced in Selenium 4.6) automates the process of downloading and managing the browser drivers. Manual configuration is no longer mandatory unless restricted by environment policies.

Driver Configuration (Optional with Selenium Manager)

Previously, the path to the browser driver had to be set explicitly in the script (e.g., with System.setProperty in Java).

With Selenium Manager:

  • Browser driver versions and paths are resolved automatically, simplifying setup.

  • Manual configuration is still possible for advanced use cases.
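
The two setup styles can be compared side by side. The sketch below is in Python, where the manual path is passed through a Service object rather than a system property. The helper names and the CHROMEDRIVER_PATH environment variable are hypothetical, chosen only for this example.

```python
import os


def driver_path_from_env(env=os.environ, var="CHROMEDRIVER_PATH"):
    """Return a manually configured driver path, or None if it is unset."""
    path = env.get(var, "").strip()
    return path or None


def create_chrome(driver_path=None):
    """Start Chrome with a manual driver path, or let Selenium Manager decide."""
    # Imported here so the path lookup above works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service
    if driver_path:
        # Manual configuration: point Selenium at a specific driver binary.
        return webdriver.Chrome(service=Service(executable_path=driver_path))
    # No path given: Selenium Manager (Selenium 4.6+) resolves the driver.
    return webdriver.Chrome()
```

Calling create_chrome(driver_path_from_env()) would use the manual path only when one is configured, which suits environments where automatic downloads are restricted.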

That’s it for now! See you in the next blog. Check out the complete series on Selenium for more such articles: The Selenium Guidebook
