String Manipulation with Regular Expressions (Regex)

1. What is Regex?

A Regular Expression (Regex) is a sequence of characters that defines a search pattern. In Java, it’s primarily used for:

  • Searching and matching text
  • Validating input (email, phone, etc.)
  • Replacing or splitting text dynamically

Java provides regex support through:

  • The java.util.regex package, whose two main classes are:
      • Pattern: a compiled regex
      • Matcher: performs matching against an input

2. Basic Regex Workflow in Java

import java.util.regex.*;

public class RegexDemo {
    public static void main(String[] args) {
        String text = "I love Java 17!";
        Pattern pattern = Pattern.compile("Java");
        Matcher matcher = pattern.matcher(text);

        if (matcher.find()) {
            System.out.println("Match found at index: " + matcher.start());
        }
    }
}

Output:

Match found at index: 7

3. Useful Regex Methods

Method                          Description                               Example
Pattern.compile(regex)          Compiles regex pattern                    Pattern p = Pattern.compile("\\d+");
matcher(CharSequence)           Creates a matcher                         Matcher m = p.matcher("abc123");
find()                          Finds next match                          m.find()
start(), end()                  Get indices of match                      m.start(), m.end()
matches()                       Checks if entire string matches pattern   "abc".matches("\\w+")
replaceAll(regex, replacement)  Replace all occurrences                   "a1b2".replaceAll("\\d", "#")
split(regex)                    Split based on pattern                    "a,b;c".split("[,;]")
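
The difference between find() and matches() is easy to miss in a table, so here is a minimal sketch of it. The class name MatcherMethodsSketch and the helper firstDigitSpan are made up for illustration; only the Pattern and Matcher calls come from java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatcherMethodsSketch {
    // Hypothetical helper: returns "start-end" for the first run of digits,
    // or "none" when the input contains no digits.
    public static String firstDigitSpan(String input) {
        Matcher m = Pattern.compile("\\d+").matcher(input);
        return m.find() ? m.start() + "-" + m.end() : "none";
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("\\d+");

        // find() succeeds if any part of the input matches
        System.out.println(p.matcher("abc123").find());     // true

        // matches() succeeds only if the whole input matches
        System.out.println(p.matcher("abc123").matches());  // false
        System.out.println(p.matcher("123").matches());     // true

        // start() is inclusive, end() is exclusive
        System.out.println(firstDigitSpan("abc123"));       // 3-6
    }
}
```

Note that start() and end() may only be called after a successful find() or matches(); otherwise they throw IllegalStateException.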

4. Regex in String Class

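The String class supports regex-based operations directly, with no explicit Pattern or Matcher needed. These methods generally compile the regex on every call, so a precompiled Pattern is usually the better choice for a regex that is used repeatedly.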

Method                                   Description
matches(String regex)                    Checks if full string matches
replaceAll(String regex, String repl)    Replace all matching substrings
replaceFirst(String regex, String repl)  Replace only the first match
split(String regex)                      Splits string based on regex pattern

Example: Replace and Split

public class StringRegexMethods {
    public static void main(String[] args) {
        String text = "Java 8, Java 11, Java 17";

        // Replace all digits
        String replaced = text.replaceAll("\\d+", "XX");
        System.out.println(replaced);  // Java XX, Java XX, Java XX

        // Split by comma or space
        String[] words = text.split("[, ]+");
        for (String word : words)
            System.out.println(word);
    }
}
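
The example above covers replaceAll() and split(). The other two String methods from the table could be sketched as follows; the class and helper names (StringRegexSketch, maskFirstNumber, isSingleWord) are invented for this illustration.

```java
public class StringRegexSketch {
    // Replaces only the first run of digits with "XX".
    public static String maskFirstNumber(String input) {
        return input.replaceFirst("\\d+", "XX");
    }

    // True only if the whole string consists of word characters.
    public static boolean isSingleWord(String input) {
        return input.matches("\\w+");
    }

    public static void main(String[] args) {
        String text = "Java 8, Java 11, Java 17";
        System.out.println(maskFirstNumber(text));     // Java XX, Java 11, Java 17
        System.out.println(isSingleWord("Java17"));    // true
        System.out.println(isSingleWord("Java 17"));   // false (the space is not a word character)
    }
}
```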

5. Common Regex Patterns for Placement Prep

Pattern (as written in a Java string)   Meaning                               Example Match
\\d                                     Any digit (0–9)                       "5"
\\D                                     Non-digit                             "A"
\\w                                     Word character (a–z, A–Z, 0–9, _)     each character of "hello_123"
\\W                                     Non-word character                    "#"
\\s                                     Whitespace                            space, tab
\\S                                     Non-whitespace                        "a"
.                                       Any character (except newline)        "x", "A"
^                                       Start of string                       ^Java matches at the start of "Java17"
$                                       End of string                         world$ matches at the end of "Hello world"
[abc]                                   Any one of a, b, c                    "a", "b", or "c"
[^abc]                                  Any char except a, b, c               "x"
(x|y)                                   Either x or y                         (yes|no) matches "yes" or "no"
{n}                                     Exactly n occurrences                 \\d{3} → "123"
+                                       One or more                           [a-z]+
*                                       Zero or more                          [a-z]*
?                                       Zero or one                           [a-z]?
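
A few of these patterns can be tried out with a short sketch like the one below. The class PatternTableSketch and its helper methods are hypothetical names chosen for the example, not part of any library.

```java
import java.util.regex.Pattern;

public class PatternTableSketch {
    // {n}: exactly three digits
    public static boolean isThreeDigits(String s) {
        return s.matches("\\d{3}");
    }

    // (x|y): either alternative
    public static boolean isYesOrNo(String s) {
        return s.matches("(yes|no)");
    }

    // ? and +: optional minus sign followed by one or more digits
    public static boolean isInteger(String s) {
        return s.matches("-?\\d+");
    }

    // ^ anchors the match to the start of the input; find() looks for a partial match
    public static boolean startsWithJava(String s) {
        return Pattern.compile("^Java").matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(isThreeDigits("123"));        // true
        System.out.println(isThreeDigits("1234"));       // false
        System.out.println(isYesOrNo("no"));             // true
        System.out.println(isInteger("-42"));            // true
        System.out.println(startsWithJava("Java17"));    // true
        System.out.println(startsWithJava("I love Java")); // false
    }
}
```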

6. Example: Validate Email & Phone

Validate Email

public class EmailValidator {
    public static void main(String[] args) {
        String email = "ben.tech@gmail.com";
        String regex = "^[\\w._%+-]+@[\\w.-]+\\.[a-zA-Z]{2,6}$";
        System.out.println(email.matches(regex));  // true
    }
}

Validate Phone Number

public class PhoneValidator {
    public static void main(String[] args) {
        String phone = "+91-9876543210";
        String regex = "^(\\+91[-\\s]?)?[0]?(91)?[6-9]\\d{9}$";
        System.out.println(phone.matches(regex));  // true
    }
}

7. Extracting Data from Text

import java.util.regex.*;

public class ExtractDemo {
    public static void main(String[] args) {
        String text = "Order IDs: 1234, 5678, and 91011.";
        Pattern p = Pattern.compile("\\d+");
        Matcher m = p.matcher(text);

        while (m.find()) {
            System.out.println("Found ID: " + m.group());
        }
    }
}

Output:

Found ID: 1234
Found ID: 5678
Found ID: 91011
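
When each match has internal structure, capturing groups (the parentheses metacharacter) let you pull out the parts separately. The following is a minimal sketch under assumed input: the key=value format, the class GroupExtractSketch, and the method parsePairs are all invented for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupExtractSketch {
    // Group 1 captures the key, group 2 captures the numeric value.
    private static final Pattern PAIR = Pattern.compile("(\\w+)=(\\d+)");

    // Collects every key=value pair found in the text, in order of appearance.
    public static Map<String, Integer> parsePairs(String text) {
        Map<String, Integer> result = new LinkedHashMap<>();
        Matcher m = PAIR.matcher(text);
        while (m.find()) {
            // group(0) or group() is the whole match; group(1), group(2) are the parts
            result.put(m.group(1), Integer.parseInt(m.group(2)));
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample input
        System.out.println(parsePairs("width=100, height=250"));  // {width=100, height=250}
    }
}
```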

Common Pattern Flags

Flag                       Description
Pattern.CASE_INSENSITIVE   Ignores case
Pattern.MULTILINE          ^ and $ match start and end of each line
Pattern.DOTALL             . matches line terminators such as newline (\n) as well
Pattern.UNICODE_CASE       Used together with CASE_INSENSITIVE to make case-insensitive matching Unicode-aware
Pattern.COMMENTS           Allows whitespace and # comments in regex
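
Flags are passed as the second argument to Pattern.compile() and can be combined with the bitwise OR operator. A small sketch follows, with hypothetical class and method names (FlagsSketch, countLinesStartingWithJava, spansLines) and made-up sample text.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FlagsSketch {
    // Counts lines that begin with "java" in any letter case.
    // MULTILINE lets ^ match at the start of every line, not only the input.
    public static int countLinesStartingWithJava(String text) {
        Pattern p = Pattern.compile("^java", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE);
        Matcher m = p.matcher(text);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    // With DOTALL, '.' also matches the newline between 'a' and 'b'.
    public static boolean spansLines(String text) {
        return Pattern.compile("a.b", Pattern.DOTALL).matcher(text).matches();
    }

    public static void main(String[] args) {
        System.out.println(countLinesStartingWithJava("Java 8\npython\nJAVA 17"));  // 2
        System.out.println(spansLines("a\nb"));                                     // true
        // Without DOTALL the same pattern does not match across the newline
        System.out.println(Pattern.compile("a.b").matcher("a\nb").matches());       // false
    }
}
```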

Key Takeaways

  • Pattern → compiles regex once

  • Matcher → performs matching operations

  • In Java string literals, double every regex backslash ("\\d+" is the regex \d+)

  • Prefer precompiled patterns for repeated use

  • replaceAll() and split() support regex directly via String class

  • Learn key metacharacters: . ^ $ * + ? [] {} () | \
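
To illustrate the advice about precompiled patterns, the sketch below compiles the email regex from section 6 once, in a static final field, and reuses it for every call. The class name PrecompiledValidatorSketch and the method isValidEmail are assumptions made for this example, and the sample addresses other than the one from section 6 are made up.

```java
import java.util.regex.Pattern;

public class PrecompiledValidatorSketch {
    // Compiled once when the class is loaded, then shared by all calls.
    private static final Pattern EMAIL =
            Pattern.compile("^[\\w._%+-]+@[\\w.-]+\\.[a-zA-Z]{2,6}$");

    // Returns false for null instead of throwing.
    public static boolean isValidEmail(String candidate) {
        return candidate != null && EMAIL.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        String[] samples = { "ben.tech@gmail.com", "ben.tech@gmail", "no-at-sign.com" };
        for (String s : samples) {
            System.out.println(s + " -> " + isValidEmail(s));
        }
    }
}
```

A Pattern instance is immutable and safe to share between threads, while each Matcher holds per-input state, which is why the sketch creates a new Matcher on every call.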