Splunk Regex Cheat Sheet

Here’s a cheat sheet for the regular expression (regex) syntax commonly used in Splunk, for example with the rex and regex search commands. Splunk uses PCRE-compatible syntax.

Character Classes

  • .: Any character except a newline.
  • \d: Any digit (equivalent to [0-9]).
  • \D: Any non-digit.
  • \w: Any word character (equivalent to [a-zA-Z0-9_]).
  • \W: Any non-word character.
  • \s: Any whitespace character.
  • \S: Any non-whitespace character.
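
As a quick illustration of the shorthand classes above, here is a minimal sketch in Python, whose re module uses the same notation for these classes as the PCRE syntax in Splunk. The log line is made up for the example.

```python
import re

# Hypothetical log line, invented for illustration.
line = "user=alice id=4021 status=OK"

digits = re.findall(r"\d+", line)   # runs of digits
words = re.findall(r"\w+", line)    # runs of word characters (letters, digits, _)
tokens = re.findall(r"\S+", line)   # runs of non-whitespace characters

print(digits)  # ['4021']
print(words)   # ['user', 'alice', 'id', '4021', 'status', 'OK']
print(tokens)  # ['user=alice', 'id=4021', 'status=OK']
```

Note that = is not a word character, so \w+ splits key=value pairs while \S+ keeps them together.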

Anchors

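  • ^: Start of the string; with the multiline flag (?m), also the start of each line.
  • $: End of the string; with the multiline flag (?m), also the end of each line.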
  • \b: Word boundary.
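
The difference between anchoring to the whole string and anchoring to each line matters for multi-line events. The sketch below uses Python's re module as a stand-in for Splunk's PCRE engine, under the assumption that both treat ^ and (?m) the same way for this case. The event text is invented.

```python
import re

# Hypothetical multi-line event, invented for illustration.
event = "ERROR disk full\nINFO retrying\nERROR timeout"

# Without (?m), ^ matches only at the very start of the string.
first_only = re.findall(r"^ERROR", event)

# With (?m), ^ also matches right after each newline.
per_line = re.findall(r"(?m)^ERROR", event)

# \b requires a word boundary on both sides, so "ERRORS" is skipped.
whole_word = re.findall(r"\bERROR\b", "ERRORS ERROR")

print(first_only)  # ['ERROR']
print(per_line)    # ['ERROR', 'ERROR']
print(whole_word)  # ['ERROR']
```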

Quantifiers

  • *: Zero or more occurrences.
  • +: One or more occurrences.
  • ?: Zero or one occurrence.
  • {n}: Exactly n occurrences.
  • {n,}: n or more occurrences.
  • {n,m}: Between n and m occurrences.
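
The counted quantifiers can be compared side by side. This is a small sketch in Python (the sample codes are made up); the quantifier syntax shown is the same in PCRE.

```python
import re

# Hypothetical codes: one letter followed by 1 to 4 digits.
codes = "a1 b22 c333 d4444"

exactly_two = re.findall(r"\b[a-z]\d{2}\b", codes)     # {n}
two_or_more = re.findall(r"\b[a-z]\d{2,}\b", codes)    # {n,}
one_to_three = re.findall(r"\b[a-z]\d{1,3}\b", codes)  # {n,m}
optional_u = re.findall(r"colou?r", "color colour")    # ?

print(exactly_two)   # ['b22']
print(two_or_more)   # ['b22', 'c333', 'd4444']
print(one_to_three)  # ['a1', 'b22', 'c333']
print(optional_u)    # ['color', 'colour']
```

The \b anchors keep {1,3} from matching only the first three digits of d4444.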

Character Escapes

  • \: Escape special characters (e.g., \., \\).
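
Forgetting to escape a dot is a common source of over-matching. The following Python sketch shows the effect; the IP-like strings are made up, and re.escape is a Python helper, not a Splunk feature.

```python
import re

unescaped = r"10.0.0.1"     # each . matches any character
escaped = r"10\.0\.0\.1"    # each \. matches a literal dot only

# The unescaped pattern accepts a string that contains no dots at all.
loose_hit = re.fullmatch(unescaped, "10a0b0c1") is not None
strict_hit = re.fullmatch(escaped, "10a0b0c1") is not None
literal_hit = re.fullmatch(escaped, "10.0.0.1") is not None

# In Python, re.escape builds the escaped form from a literal string.
auto_escaped = re.escape("10.0.0.1")

print(loose_hit, strict_hit, literal_hit)  # True False True
print(auto_escaped == escaped)             # True
```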

Character Sets

  • [abc]: Any one of the characters a, b, or c.
  • [^abc]: Any character except a, b, or c.
  • [a-z]: Any lowercase letter.
  • [A-Z]: Any uppercase letter.
  • [0-9]: Any digit.
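
A negated set is a handy way to capture a value up to the next delimiter. Here is a minimal Python sketch with a made-up comma-separated record; the set syntax is the same in PCRE.

```python
import re

# Hypothetical record, invented for illustration.
record = "user=alice,role=admin,session=9f3c"

# [^,]+ takes everything after = up to (not including) the next comma.
values = re.findall(r"=([^,]+)", record)

# A range-based set: exactly four lowercase hex characters as a whole word.
hex_ids = re.findall(r"\b[0-9a-f]{4}\b", record)

print(values)   # ['alice', 'admin', '9f3c']
print(hex_ids)  # ['9f3c']
```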

Grouping and Alternation

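  • (abc): Capturing group; matches abc and saves it for later reference.
  • (?:abc): Non-capturing group; groups without saving the match.
  • (?<name>abc): Named capturing group; in Splunk's rex command, the name becomes the extracted field name.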
  • a|b: Matches either a or b.
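
Alternation has low precedence, so it usually needs a group around it. The sketch below is in Python, which writes named groups as (?P<name>...), whereas Splunk's rex command uses (?<name>...). The request line is made up.

```python
import re

# Hypothetical access-log fragment, invented for illustration.
request = "POST /login 200"

# Named groups; in Splunk the equivalent syntax would be (?<method>...) and (?<path>...).
m = re.search(r"(?P<method>GET|POST) (?P<path>\S+)", request)
method, path = m.group("method"), m.group("path")

# Without a group, ^GET|POST$ means "starts with GET" OR "ends with POST".
ungrouped = re.search(r"^GET|POST$", "GETTING") is not None
# With a non-capturing group, the anchors apply to both alternatives.
grouped = re.search(r"^(?:GET|POST)$", "GETTING") is not None

print(method, path)        # POST /login
print(ungrouped, grouped)  # True False
```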

Quantifiers with Lazy Matching

  • *?: Zero or more occurrences (lazy).
  • +?: One or more occurrences (lazy).
  • ??: Zero or one occurrence (lazy).
  • {n}?: Exactly n occurrences (the ? has no effect here, since the count is fixed).
  • {n,}?: n or more occurrences (lazy).
  • {n,m}?: Between n and m occurrences (lazy).
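
Greedy and lazy quantifiers differ in how much they consume before the rest of the pattern is tried. Here is a minimal Python sketch with a made-up input string; the behavior shown is standard for PCRE-style engines.

```python
import re

text = "<a><b>"

# Greedy: .+ grabs as much as possible, so the match runs to the last >.
greedy = re.findall(r"<.+>", text)

# Lazy: .+? grabs as little as possible, so each tag matches separately.
lazy = re.findall(r"<.+?>", text)

print(greedy)  # ['<a><b>']
print(lazy)    # ['<a>', '<b>']
```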

Lookahead and Lookbehind

  • (?=...): Positive lookahead.
  • (?!...): Negative lookahead.
  • (?<=...): Positive lookbehind.
  • (?<!...): Negative lookbehind.
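
Lookarounds check the surrounding text without including it in the match. The Python sketch below uses a made-up input line. Note that Python requires fixed-width lookbehind patterns, which the ones here are.

```python
import re

# Hypothetical input, invented for illustration.
line = "price=100USD cost=250EUR"

# Positive lookahead: digits that are followed by USD.
usd = re.findall(r"\d+(?=USD)", line)

# Negative lookahead: digits not followed by USD. The \d alternative stops
# the engine from backtracking to "10" (which is followed by "0", not "USD").
not_usd = re.findall(r"\d+(?!\d|USD)", line)

# Positive lookbehind: digits that are preceded by cost=.
cost = re.findall(r"(?<=cost=)\d+", line)

# Negative lookbehind: whole numbers not preceded by price=.
not_price = re.findall(r"(?<!price=)\b\d+", line)

print(usd)        # ['100']
print(not_usd)    # ['250']
print(cost)       # ['250']
print(not_price)  # ['250']
```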

Common Patterns

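  • \b\d{3}-\d{2}-\d{4}\b: U.S. Social Security number in NNN-NN-NNNN format.
  • \b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b: Email address.
  • \b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b: IPv4 address (checks the shape only; out-of-range values such as 999.999.999.999 also match).
  • \b(?:\d{1,3}\.){3}\d{1,3}\b: The same IPv4 match, written more compactly with a non-capturing group.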
  • \b(?:https?|ftp):\/\/\S+\b: URL starting with http, https, or ftp.
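
Patterns like these are worth testing against both good and bad samples before use. The Python sketch below is one way to do that. The sample strings are made up, and the stricter IPv4 pattern (OCTET, IPV4_STRICT) is an illustrative addition that is not part of the list above.

```python
import re

EMAIL = re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b")
IPV4_SHAPE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

# Hypothetical stricter variant: each octet limited to 0-255.
OCTET = r"(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)"
IPV4_STRICT = re.compile(rf"\b(?:{OCTET}\.){{3}}{OCTET}\b")

email = EMAIL.search("contact bob.smith@example.com now").group()

# Inside a character class, | is a literal character, not alternation,
# so a class such as [A-Z|a-z] would accept a pipe in the domain suffix.
pipe_ok = re.fullmatch(r"[A-Z|a-z]{2,}", "c|m") is not None

shape_accepts_bad = IPV4_SHAPE.search("999.999.999.999") is not None
strict_accepts_bad = IPV4_STRICT.search("999.999.999.999") is not None
strict_accepts_good = IPV4_STRICT.fullmatch("192.168.1.10") is not None

print(email)                                   # bob.smith@example.com
print(pipe_ok)                                 # True
print(shape_accepts_bad, strict_accepts_bad)   # True False
print(strict_accepts_good)                     # True
```

Whether the extra strictness is worth the longer pattern depends on how clean the indexed data is; for well-formed logs, the shape-only pattern is often enough.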

Remember to adjust these patterns to your specific use case and data. Regular expressions are powerful, but they must be crafted carefully and tested against sample events to match your data accurately.