Java SAX Parser Example

Published on August 4, 2022
Author: Pankaj

The SAX parser in Java provides an API for parsing XML documents. Unlike a DOM parser, a SAX parser does not load the complete XML document into memory; it reads the document sequentially and reports what it finds as a series of events.
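To make the sequential, event-driven behaviour concrete, here is a minimal sketch that records the callbacks in the order the parser fires them. The class name SaxEventOrderDemo and the helper recordEvents() are hypothetical names chosen for this illustration, and the XML is parsed from an in-memory string rather than a file.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxEventOrderDemo {

    // Parses the XML string and returns the event names in the order they were fired.
    static List<String> recordEvents(String xml) throws Exception {
        List<String> events = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startDocument() {
                events.add("startDocument");
            }

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attributes) {
                events.add("start:" + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                events.add("end:" + qName);
            }

            @Override
            public void endDocument() {
                events.add("endDocument");
            }
        };
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return events;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Employees><Employee id=\"1\"><name>Pankaj</name></Employee></Employees>";
        for (String event : recordEvents(xml)) {
            System.out.println(event);
        }
    }
}
```

Each element is reported once when it opens and once when it closes, in document order, so the handler never needs the whole tree in memory.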

SAX Parser

javax.xml.parsers.SAXParser provides methods to parse an XML document using event handlers. This class wraps an org.xml.sax.XMLReader (available through getXMLReader()) and provides overloaded versions of the parse() method to read an XML document from a File, an InputStream, a SAX InputSource or a String URI.

The actual work of interpreting the document is done by a handler class that we have to write ourselves. A handler implements the org.xml.sax.ContentHandler interface, which contains callback methods that are notified when a parsing event occurs, for example startDocument(), endDocument(), startElement(), endElement() and characters(). org.xml.sax.helpers.DefaultHandler provides a default implementation of ContentHandler (as well as of EntityResolver, DTDHandler and ErrorHandler), and we can extend this class to create our own handler. Extending it is advisable because we usually need to implement only a few of the methods, which keeps our code cleaner and more maintainable.
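As a rough illustration of these two points, the sketch below extends DefaultHandler, overrides only startElement(), and feeds the same handler type through two of the parse() overloads (InputStream and InputSource). ParseOverloadsDemo, ElementCounter, countFromStream() and countFromSource() are hypothetical names used only for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class ParseOverloadsDemo {

    // Hypothetical handler: overrides one callback and inherits the defaults for the rest.
    static class ElementCounter extends DefaultHandler {
        final Map<String, Integer> counts = new TreeMap<>();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes) {
            // Count how many times each element name opens.
            counts.merge(qName, 1, Integer::sum);
        }
    }

    // Uses the parse(InputStream, DefaultHandler) overload.
    static Map<String, Integer> countFromStream(byte[] xmlBytes) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        ElementCounter counter = new ElementCounter();
        parser.parse(new ByteArrayInputStream(xmlBytes), counter);
        return counter.counts;
    }

    // Uses the parse(InputSource, DefaultHandler) overload.
    static Map<String, Integer> countFromSource(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        ElementCounter counter = new ElementCounter();
        parser.parse(new InputSource(new StringReader(xml)), counter);
        return counter.counts;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Employees><Employee id=\"1\"/><Employee id=\"2\"/></Employees>";
        System.out.println(countFromStream(xml.getBytes(StandardCharsets.UTF_8)));
        System.out.println(countFromSource(xml));
    }
}
```

Because DefaultHandler already supplies the remaining callbacks, the handler stays a few lines long regardless of which input type is used.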

SAX parser Example

Let’s jump to the SAX parser example program now; I will explain the different features in detail later on. The input file is employees.xml:

<?xml version="1.0" encoding="UTF-8"?>
<Employees>
	<Employee id="1">
		<age>29</age>
		<name>Pankaj</name>
		<gender>Male</gender>
		<role>Java Developer</role>
	</Employee>
	<Employee id="2">
		<age>35</age>
		<name>Lisa</name>
		<gender>Female</gender>
		<role>CEO</role>
	</Employee>
	<Employee id="3">
		<age>40</age>
		<name>Tom</name>
		<gender>Male</gender>
		<role>Manager</role>
	</Employee>
	<Employee id="4">
		<age>25</age>
		<name>Meghna</name>
		<gender>Female</gender>
		<role>Manager</role>
	</Employee>
</Employees>

So we have an XML file stored somewhere in the file system, and by looking at it we can conclude that it contains a list of Employee elements. Every Employee has an id attribute and the fields age, name, gender and role. We will use the SAX parser to parse this XML and create a list of Employee objects. Here is the Employee class representing the Employee element from the XML.

package com.journaldev.xml;

public class Employee {
    private int id;
    private String name;
    private String gender;
    private int age;
    private String role;
    
    public int getId() {
        return id;
    }
    public void setId(int id) {
        this.id = id;
    }
    public String getName() {
        return name;
    }
    public void setName(String name) {
        this.name = name;
    }
    public String getGender() {
        return gender;
    }
    public void setGender(String gender) {
        this.gender = gender;
    }
    public int getAge() {
        return age;
    }
    public void setAge(int age) {
        this.age = age;
    }
    public String getRole() {
        return role;
    }
    public void setRole(String role) {
        this.role = role;
    }
    
    @Override
    public String toString() {
        return "Employee:: ID="+this.id+" Name=" + this.name + " Age=" + this.age + " Gender=" + this.gender +
                " Role=" + this.role;
    }
    
}

Let’s create our own SAX parser handler class by extending the DefaultHandler class.


package com.journaldev.xml.sax;

import java.util.ArrayList;
import java.util.List;

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

import com.journaldev.xml.Employee;

public class MyHandler extends DefaultHandler {

	// List to hold Employee objects
	private List<Employee> empList = null;
	private Employee emp = null;
	private StringBuilder data = null;

	// getter method for employee list
	public List<Employee> getEmpList() {
		return empList;
	}

	boolean bAge = false;
	boolean bName = false;
	boolean bGender = false;
	boolean bRole = false;

	@Override
	public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {

		if (qName.equalsIgnoreCase("Employee")) {
			// read the id attribute of the Employee element
			String id = attributes.getValue("id");
			// initialize Employee object and set id attribute
			emp = new Employee();
			emp.setId(Integer.parseInt(id));
			// initialize list
			if (empList == null)
				empList = new ArrayList<>();
		} else if (qName.equalsIgnoreCase("name")) {
			// set boolean values for fields, will be used in setting Employee variables
			bName = true;
		} else if (qName.equalsIgnoreCase("age")) {
			bAge = true;
		} else if (qName.equalsIgnoreCase("gender")) {
			bGender = true;
		} else if (qName.equalsIgnoreCase("role")) {
			bRole = true;
		}
		// create the data container
		data = new StringBuilder();
	}

	@Override
	public void endElement(String uri, String localName, String qName) throws SAXException {
		if (bAge) {
			// age element, set Employee age
			emp.setAge(Integer.parseInt(data.toString()));
			bAge = false;
		} else if (bName) {
			emp.setName(data.toString());
			bName = false;
		} else if (bRole) {
			emp.setRole(data.toString());
			bRole = false;
		} else if (bGender) {
			emp.setGender(data.toString());
			bGender = false;
		}
		
		if (qName.equalsIgnoreCase("Employee")) {
			// add Employee object to list
			empList.add(emp);
		}
	}

	@Override
	public void characters(char[] ch, int start, int length) throws SAXException {
		// append the chunk directly, without creating an intermediate String
		data.append(ch, start, length);
	}
}

MyHandler keeps the list of Employee objects as a field that is exposed only through a getter method. The Employee objects are added to the list in the event handler methods. The handler also has an Employee field that holds the Employee currently being built; once all of its fields are set, it is added to the employee list.
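The same "build the current object, then add it to the list" pattern can also be written without boolean flags by dispatching on the element name in endElement(). The sketch below is a variation for illustration, not the handler used in this tutorial: to stay short it stores each employee as a Map instead of an Employee object, and the names QNameDispatchDemo, EmployeeMapHandler and parse() are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class QNameDispatchDemo {

    // Hypothetical handler: collects each Employee as a map of field name to text.
    static class EmployeeMapHandler extends DefaultHandler {
        private final List<Map<String, String>> employees = new ArrayList<>();
        private final StringBuilder data = new StringBuilder();
        private Map<String, String> current; // employee being built, or null

        List<Map<String, String>> getEmployees() {
            return employees;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes) {
            data.setLength(0); // start collecting text for this element
            if ("Employee".equals(qName)) {
                current = new LinkedHashMap<>();
                current.put("id", attributes.getValue("id"));
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            data.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("Employee".equals(qName)) {
                employees.add(current); // employee is complete
                current = null;
            } else if (current != null) {
                // child field such as age or name; an empty element yields ""
                current.put(qName, data.toString().trim());
            }
            data.setLength(0);
        }
    }

    static List<Map<String, String>> parse(String xml) throws Exception {
        EmployeeMapHandler handler = new EmployeeMapHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.getEmployees();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Employees><Employee id=\"1\"><age>29</age><name>Pankaj</name><role/></Employee></Employees>";
        System.out.println(parse(xml));
    }
}
```

Since the buffer is cleared at every start and end tag, an empty element such as <role/> simply produces an empty string instead of picking up text from a neighbouring element.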

SAX parser methods to override

The important methods to override are startElement(), endElement() and characters().

When SAXParser is parsing the document and finds a start tag, it calls the startElement() method. We override this method to set the boolean variables that identify which element we are in. We also use it to create a new Employee object every time an Employee start element is found; note how the id attribute is read there to set the id field of the Employee object.

The characters() method is called when SAXParser finds character data inside an element. Note that the SAX parser may divide the data into multiple chunks and call characters() several times for a single element (see the documentation of the characters() method in the ContentHandler interface). That is why we collect the data in a StringBuilder using its append() method.

The endElement() method is where we use the StringBuilder data to set the properties of the Employee object, and where we add the Employee object to the list whenever the Employee end tag is found.

Below is the test program that uses MyHandler to parse the above XML into a list of Employee objects.
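The chunking behaviour is hard to trigger on demand with a real parser, so the following sketch simulates it by invoking the callbacks by hand: the text of one role element is delivered in two characters() calls, and the StringBuilder joins them back together. ChunkedCharactersDemo, RoleHandler and simulateChunks() are hypothetical names for this illustration.

```java
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

class ChunkedCharactersDemo {

    // Hypothetical handler that captures the text of a single role element.
    static class RoleHandler extends DefaultHandler {
        private final StringBuilder data = new StringBuilder();
        private String role;

        String getRole() {
            return role;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes)
                throws SAXException {
            data.setLength(0); // new element, start with an empty buffer
        }

        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            data.append(ch, start, length); // may be called several times per element
        }

        @Override
        public void endElement(String uri, String localName, String qName) throws SAXException {
            if ("role".equals(qName)) {
                role = data.toString(); // all chunks have arrived by now
            }
        }
    }

    // Drives the handler the way a parser might, splitting the text into the given chunks.
    static String simulateChunks(String... chunks) throws SAXException {
        RoleHandler handler = new RoleHandler();
        handler.startElement("", "", "role", new AttributesImpl());
        for (String chunk : chunks) {
            char[] buffer = chunk.toCharArray();
            handler.characters(buffer, 0, buffer.length);
        }
        handler.endElement("", "", "role");
        return handler.getRole();
    }

    public static void main(String[] args) throws SAXException {
        System.out.println(simulateChunks("Java ", "Developer")); // prints "Java Developer"
    }
}
```

A handler that assigned the field directly inside characters() would keep only the last chunk ("Developer" here), which is the failure mode the StringBuilder avoids.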

package com.journaldev.xml.sax;

import java.io.File;
import java.io.IOException;
import java.util.List;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.SAXException;

import com.journaldev.xml.Employee;

public class XMLParserSAX {

    public static void main(String[] args) {
        SAXParserFactory saxParserFactory = SAXParserFactory.newInstance();
        try {
            SAXParser saxParser = saxParserFactory.newSAXParser();
            MyHandler handler = new MyHandler();
            saxParser.parse(new File("/Users/pankaj/employees.xml"), handler);
            // Get Employees list
            List<Employee> empList = handler.getEmpList();
            // print employee information
            for (Employee emp : empList)
                System.out.println(emp);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            e.printStackTrace();
        }
    }

}

Here is the output of the above program.

Employee:: ID=1 Name=Pankaj Age=29 Gender=Male Role=Java Developer
Employee:: ID=2 Name=Lisa Age=35 Gender=Female Role=CEO
Employee:: ID=3 Name=Tom Age=40 Gender=Male Role=Manager
Employee:: ID=4 Name=Meghna Age=25 Gender=Female Role=Manager

SAXParserFactory provides factory methods to get a SAXParser instance. We pass a File object to the parse() method along with a MyHandler instance to handle the callback events. SAXParser is a little confusing at first, but if you are working with a large XML document, it is a more memory-efficient way to read XML than a DOM parser, because the document is never held in memory as a whole.

That’s all for SAX Parser in Java.
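The factory can also be configured before the parser is created. As a hedged sketch, the example below turns on namespace awareness and the standard JAXP secure-processing feature, then reports the uri, localName and qName values that startElement() receives for the root element. FactoryConfigDemo, describeRoot() and the urn:example namespace are assumptions made for this illustration; the tutorial's own program uses the factory defaults.

```java
import java.io.StringReader;

import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class FactoryConfigDemo {

    // Returns "uri|localName|qName" as reported for the root element.
    static String describeRoot(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // report namespace URI and local name
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true); // apply implementation limits
        SAXParser parser = factory.newSAXParser();

        String[] result = new String[1];
        parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attributes) {
                if (result[0] == null) { // only the first (root) element
                    result[0] = uri + "|" + localName + "|" + qName;
                }
            }
        });
        return result[0];
    }

    public static void main(String[] args) throws Exception {
        System.out.println(describeRoot("<e:Employee xmlns:e=\"urn:example\"/>"));
    }
}
```

With namespace awareness enabled, comparing localName (and uri) is usually more robust than comparing qName, because qName still carries whatever prefix the document author chose.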

You can download the project from our GitHub Repository.

Reference: SAXParser, DefaultHandler

JournalDev
DigitalOcean Employee
DigitalOcean Employee badge
August 30, 2013

Hi Pankaj, i appreciate your contributions. pls i am writing a GUI calculator program. When i click on a number, it displays in the textfield, but, the moment i click on another, it overrides the previous. how do i go about this?

- Jide

    September 19, 2013

    Just append the label of button with textbox content and set It again. String num=t1.getText()+btn.getLabel(); t1.setText(num); something like this

    - Varsha

      November 1, 2013

      Thank you very much, very clear explanations. I’ve been trying for several hours to wrap my head around the concept and usage of the SAX API and you made it easy for me. Thank you again!

      - Andres

        November 9, 2013

        Hi Pankaj, Is the response data Should be in request.Is there any chance for hidden… Means already stored data we need with some reference field. like through Employee Id, we need to get Employee details(already Employee Details are stored)… How it is???

        - pallavi

          November 22, 2013

          Hi, For example using startelement I’m passing as but in the endelement I should get only as not as .I have used ContentHandler for this. How can we implement to have different name in header and footer. Any help can be greatly appreciated!

          - Ariyur

            February 18, 2014

            HI, SIR… its fine for single xml document, if in case we are using more than one xml document how it will differentiate and store. ex: you are taken employee xml document, i am inserting student.xml doc and book.xml document in that how it will identify individually and store. if we enter pankaj how it will identify whether it is student name or employee name or author name.

            - ashwini

              May 18, 2014

              Hi, when running the program I get the following output: pojos.Employee@1a3ca10 pojos.Employee@26f9e5 pojos.Employee@e06703 pojos.Employee@8b1a4f rename packages what is the problem?

              - Emmanuel

                June 3, 2014

                How to parse empty tags? I mean example is cool but very simple. I can’t figure out how to parse short notation tags: or empty ones but lik …this is correct xml, but according to code above, next node will be treated as content of it…

                - Ellen

                  June 12, 2014

                  Hi Pankaj, I have no coding experience and I’m trying to parse an xml file. I used your code and modified it. When I run it, i get this:

                  Exception in thread “main” java.lang.NullPointerException
                      at edu.illinois.lis.hakimra2.myproject.MyHandler.characters(MyHandler.java:72)
                      at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.characters(Unknown Source)
                      at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
                      at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
                      at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
                      at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
                      at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(Unknown Source)
                      at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
                      at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl.parse(Unknown Source)
                      at javax.xml.parsers.SAXParser.parse(Unknown Source)
                      at edu.illinois.lis.hakimra2.myproject.XMLParserSAX.main(XMLParserSAX.java:19)

                  Do you know what might the problem be? Also, I want to store my data in a csv file. Can you tell me how to do it? Thanks

                  - Dalal

                    July 9, 2014

                    Can you help me, how to do sax parsing for be low xml 1 PROD=DPP;PLC=IOYR2013NSSN 1003N.3::DPP::0 94 94 15 15 94 11.6 12.34 15.95 10.5 CF=1;PLC=IOYR2013NSSN;M_ITEMSELL:MFY.=93.85;M_SALESTAX:TFN.=7.37;M_SHIPPING:SFYS=7.78;M_SHIPSUR:SFYS=0;P_ITEMSELL:MFN.=94;P_SHIPSUR:SFNS=0;P_TSH:SFNS=15;A_ITEM:PAX=94;A_ITEM:USD=94;A_SHIP:PAX=15;A_SHIP:USD=15;D_PV:PAX=15.95;D_BV:PAX=10.5 94 94 15 15 94 11.6 12.34 15.95 10.5 CF=1;PLC=IOYR2013NSSN;M_ITEMSELL:MFY.=93.85;M_SALESTAX:TFN.=7.37;M_SHIPPING:SFYS=7.78;M_SHIPSUR:SFYS=0;P_ITEMSELL:MFN.=94;P_SHIPSUR:SFNS=0;P_TSH:SFNS=15;A_ITEM:PAX=94;A_ITEM:USD=94;A_SHIP:PAX=15;A_SHIP:USD=15;D_PV:PAX=15.95;D_BV:PAX=10.5

                    - ramesh
