Java Regular Expression to extract data from Text

This tutorial shows you how to extract data from text using a regular expression.

Related article: Java Regular Expression Overview – Syntax

I. Problem

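Assume that we already have a string that contains weather data, one entry per line, like this:

Location: New York, Temperature: 10
Location: Nevada, Temperature: 22
Location: London, Temperature: 12

We need to extract each entry with a regular expression and store it in a Java object.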

II. Practice

– First, we need a data model class to store Weather data for each item we get.


public class Weather {
	private String location;
	private int temperature;

	// constructor, getter/setter methods...
}

– Next, we write our own pattern to extract the data (location, temperature) from the input text.
There are many ways to do this. For details on choosing the syntax for the pattern string, see the article Java Regular Expression Overview – Syntax.
In this case, we can create a pattern like this:


String pattern = "(Location:)(\\s*.+)(,)(.*Temperature:\\s+)(\\d+)";

The pattern has 5 capturing groups, but the data we need is only in the 2nd group (location) and the 5th group (temperature).
So the Java code should be similar to:


Pattern p = Pattern.compile(pattern);
Matcher m = p.matcher(txt);

while (m.find()) {
	String location = m.group(2);
	String temperature = m.group(5);
}
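
To check which group holds which part of a line, it can help to print every group for a single match. The following is only a minimal sketch: the class name GroupDump and the method groupsOf are hypothetical names chosen for illustration and are not part of the source code in this tutorial.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, used only to inspect the groups of the pattern above.
class GroupDump {

	// Returns the text captured by groups 1..groupCount() for the first match,
	// or an empty list when the line does not match.
	static List<String> groupsOf(String line) {
		Pattern p = Pattern.compile("(Location:)(\\s*.+)(,)(.*Temperature:\\s+)(\\d+)");
		Matcher m = p.matcher(line);
		List<String> groups = new ArrayList<>();
		if (m.find()) {
			for (int i = 1; i <= m.groupCount(); i++) {
				groups.add(m.group(i));
			}
		}
		return groups;
	}

	public static void main(String[] args) {
		List<String> groups = groupsOf("Location: New York, Temperature: 10");
		// Brackets make leading and trailing whitespace visible.
		for (int i = 0; i < groups.size(); i++) {
			System.out.println("group " + (i + 1) + " = [" + groups.get(i) + "]");
		}
	}
}
```

Note that group 2 starts with the space that follows "Location:", because \\s* sits inside the group. That is why the location should be trimmed before it is stored.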

III. Source code

Weather.java


public class Weather {
	private String location;
	private int temperature;

	public Weather(String location, int temperature) {
		this.location = location;
		this.temperature = temperature;
	}

	public String getLocation() {
		return location;
	}

	public void setLocation(String location) {
		this.location = location;
	}

	public int getTemperature() {
		return temperature;
	}

	public void setTemperature(int temperature) {
		this.temperature = temperature;
	}

	@Override
	public String toString() {
		return "Weather [location=" + location + ", temperature=" + temperature + "]";
	}
}

MainApp.java


import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MainApp {

	public static void main(String[] args) {
		List<Weather> weathers = new ArrayList<>();

		String txt = "Location: New York, Temperature: 10"
				+ "\nLocation: Nevada, Temperature: 22"
				+ "\nLocation: London, Temperature: 12";

		String pattern = "(Location:)(\\s*.+)(,)(.*Temperature:\\s+)(\\d+)";

		Pattern p = Pattern.compile(pattern);
		Matcher m = p.matcher(txt);

		while (m.find()) {

			System.out.println("found Location >> " + m.group(2));
			String location = m.group(2).trim();

			System.out.println("found Temperature >> " + m.group(5));
			String temperature = m.group(5);

			weathers.add(new Weather(location, Integer.parseInt(temperature)));
		}

		System.out.println(weathers);
	}
}

Run the code. The result shows in the console window:


found Location >>  New York
found Temperature >> 10
found Location >>  Nevada
found Temperature >> 22
found Location >>  London
found Temperature >> 12
[Weather [location=New York, temperature=10], Weather [location=Nevada, temperature=22], Weather [location=London, temperature=12]]
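
As a possible variation, the same extraction can be written with named groups, so the code does not depend on counting group positions. This is a sketch under assumptions, not the tutorial's own approach: the class NamedGroupSketch, the method extract, and the group names location and temperature are hypothetical, and the result is collected in a Map instead of Weather objects to keep the example self-contained.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the tutorial's source code.
class NamedGroupSketch {

	// Only the two values we need are captured, each in a named group.
	// [^,\r\n]+ stops at the first comma and never crosses a line break.
	private static final Pattern LINE = Pattern.compile(
			"Location:\\s*(?<location>[^,\\r\\n]+),\\s*Temperature:\\s*(?<temperature>\\d+)");

	// Returns location -> temperature, in the order the entries appear in the text.
	static Map<String, Integer> extract(String txt) {
		Map<String, Integer> result = new LinkedHashMap<>();
		Matcher m = LINE.matcher(txt);
		while (m.find()) {
			String location = m.group("location").trim();
			int temperature = Integer.parseInt(m.group("temperature"));
			result.put(location, temperature);
		}
		return result;
	}

	public static void main(String[] args) {
		String txt = "Location: New York, Temperature: 10"
				+ "\nLocation: Nevada, Temperature: 22"
				+ "\nLocation: London, Temperature: 12";
		System.out.println(extract(txt));
	}
}
```

Unlike the original pattern, this one takes the location up to the first comma, so it assumes that location names do not contain commas.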


By grokonez | February 7, 2017.

Last updated on May 4, 2021.


