You have a wall of text — chat logs, customer notes, scraped HTML — and you need every phone number in it. Here are five practical methods.
Method 1: Online extractor (zero install)
Paste your text into our number extractor, enable Include separators, and get a list of every digit cluster including phone-like patterns.
Best for: one-off extractions, sensitive data, anyone non-technical.
Method 2: Custom regex
# US/international phone numbers (loose)
(\+?\d{1,3})?[-.\s]?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}
# E.164 format (strict)
\+[1-9]\d{7,14}
Run with grep -oP on the command line (GNU grep in PCRE mode); plain grep -oE uses POSIX extended syntax, which does not recognize \d. Faster on huge files than any online tool.
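If you would rather stay in a script than on the command line, the same idea can be sketched with Python's built-in re module. This is a minimal illustration, not a complete phone parser: the loose pattern here is a variant with the country code wrapped in an optional group, the E.164 pattern assumes 8 to 15 digits with a non-zero first digit, and the names LOOSE, E164, extract, and sample are made up for this example.

```python
import re

# Loose US/international pattern: optional country code, optional
# parentheses around the area code, and -, ., or whitespace as separators.
LOOSE = re.compile(r"(?:\+?\d{1,3})?[-.\s]?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}")

# Strict E.164-style pattern: a plus sign followed by digits only.
E164 = re.compile(r"\+[1-9]\d{7,14}")

def extract(pattern, text):
    # The loose pattern may capture a leading space as a "separator",
    # so surrounding whitespace is trimmed from each match.
    return [m.group(0).strip() for m in pattern.finditer(text)]

sample = "Call +1 (415) 555-0123 or 415.555.0199; WhatsApp: +442079460958."
loose_hits = extract(LOOSE, sample)
strict_hits = extract(E164, sample)

print(loose_hits)   # ['+1 (415) 555-0123', '415.555.0199', '+442079460958']
print(strict_hits)  # ['+442079460958']
```

Note the trade-off: the loose pattern catches human-formatted numbers but will also match unrelated digit runs of the right shape, while the strict pattern only finds numbers already written in compact international form.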
Method 3: libphonenumber (Google’s library)
Google’s libphonenumber is the gold standard. It validates numbers against country-specific rules. The Python port used below is published as phonenumbers.
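# pip install phonenumbers
from phonenumbers import PhoneNumberMatcher
text = "Call me at +1 (415) 555-0123 or 020 7946 0958"
# "US" is the default region assumed for numbers written without a country code
for m in PhoneNumberMatcher(text, "US"):
    print(m.raw_string)  # the matched substring exactly as it appears in text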
Best when: you need validated phone numbers, especially across regions.
Method 4: NLP entity recognition (spaCy, Hugging Face)
For unstructured text where phone numbers are written naturally, an NLP pipeline can pick them out by context. Slower and more setup.
Method 5: LLM (Claude, GPT)
Send the text to a model and ask for a list. Highest accuracy. Costs money, sends your data to a third party, and is the slowest option.
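A model's reply is generated text, so it is not guaranteed to match the source character for character. One cheap safeguard is to keep only the returned numbers whose digits actually occur in the original text. The sketch below uses only the standard library; the function name is_grounded, the separator set, the 7-digit minimum, and the sample model_output list are all assumptions made for illustration, and no real LLM API is called.

```python
import re

# Characters allowed to sit between digits in the source text (assumed set).
SEPARATORS = r"[\s().\-]*"

def is_grounded(candidate: str, source: str) -> bool:
    """Return True if the candidate's digits occur in order in the source,
    separated only by common phone-number punctuation."""
    digits = re.sub(r"\D", "", candidate)
    if len(digits) < 7:  # assumed minimum length for a plausible number
        return False
    # Build a pattern like 1[sep]*4[sep]*1... from the candidate's digits.
    pattern = SEPARATORS.join(digits)
    return re.search(pattern, source) is not None

source = "Ring me on +1 (415) 555-0123 after 5pm."
# Hypothetical model reply: the second entry has a wrong final digit.
model_output = ["+14155550123", "+14155550124"]

verified = [n for n in model_output if is_grounded(n, source)]
print(verified)  # ['+14155550123']
```

This does not prove a number is valid, only that the model did not invent or alter it.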
Comparison table
| Method | Accuracy | Speed | Privacy | Setup |
|---|---|---|---|---|
| Online extractor | Medium | Fast | ✅ Browser | None |
| Custom regex | Medium-high | Very fast | ✅ Local | Low |
| libphonenumber | High | Fast | ✅ Local | Medium |
| NLP / spaCy | High | Medium | ✅ Local | High |
| LLM | Highest | Slow | ❌ External | Low |
The pragmatic choice
For 80% of cases, the online extractor or a regex is enough. Reach for libphonenumber when you need to validate the numbers. LLMs are a last resort.