Comprehensive URL validation regex pattern for HTTP and HTTPS URLs with domain validation
^                                   Start of string
https?                              HTTP or HTTPS protocol
:\/\/                               Protocol separator
(www\.)?                            Optional www subdomain
[-a-zA-Z0-9@:%._\+~#=]{1,256}       Domain name characters (1-256 length)
\.[a-zA-Z0-9()]{1,6}                Top-level domain (1-6 characters)
\b                                  Word boundary
([-a-zA-Z0-9()@:%_\+.~#?&//=]*)     Optional path, query parameters
$                                   End of string
const urlRegex = /^https?:\/\/(www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_\+.~#?&//=]*)$/;
console.log(urlRegex.test("https://example.com")); // true
console.log(urlRegex.test("http://www.site.org/path?query=1")); // true
import re
# The trailing $ anchors the match to the end of the string, so trailing junk is rejected
url_pattern = r"^https?:\/\/(www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_\+.~#?&//=]*)$"
print(bool(re.match(url_pattern, "https://example.com")))  # True
print(bool(re.match(url_pattern, "http://www.site.org/path?query=1")))  # True
import java.util.regex.*;
public class UrlRegex {
    public static void main(String[] args) {
        String urlRegex = "^https?:\\/\\/(www\\.)?[-a-zA-Z0-9@:%._\\+~#=]{1,256}\\.[a-zA-Z0-9()]{1,6}\\b([-a-zA-Z0-9()@:%_\\+.~#?&//=]*)$";
        System.out.println(Pattern.matches(urlRegex, "https://example.com"));
        System.out.println(Pattern.matches(urlRegex, "http://www.site.org/path?query=1"));
    }
}
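A small table-driven check is one way to exercise the pattern against several inputs at once. The sketch below is illustrative only: the `cases` array and the `mismatches` name are assumptions made for this example, not part of the original tool.

```javascript
// Same pattern as in the snippets above, repeated so this runs on its own.
const urlRegex = /^https?:\/\/(www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_\+.~#?&//=]*)$/;

// Hypothetical sample inputs, each paired with the result expected from the pattern.
const cases = [
  ["https://example.com", true],
  ["http://www.site.org/path?query=1", true],
  ["https://sub.domain.co.uk/path/to/page", true],
  ["ftp://example.com", false],
  ["not-a-url", false],
  ["http://", false],
];

// Collect every input whose actual result differs from the expected one.
const mismatches = cases.filter(
  ([input, expected]) => urlRegex.test(input) !== expected
);

for (const [input, expected] of cases) {
  console.log(`${input} -> ${urlRegex.test(input)} (expected ${expected})`);
}
console.log(`mismatches: ${mismatches.length}`);
```

Keeping the inputs in one array makes it easy to add edge cases later without touching the checking logic.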
https://example.com                      Valid
http://www.site.org/path?query=1         Valid
https://sub.domain.co.uk/path/to/page    Valid
ftp://example.com                        Invalid
not-a-url                                Invalid
http://                                  Invalid
Validate URL format when users submit links to ensure the basic format is correct and avoid access errors caused by invalid links.
// Link format validation
if (!urlRegex.test(userUrl)) {
  showError("Please enter a valid URL");
}
Identify and validate URLs in user-generated content for content moderation, link security checks, and preventing malicious link propagation.
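// Extract URLs from content. urlRegex is anchored with ^ and $, so it only
// matches a whole string; strip the anchors and add the g flag to search inside text.
const urlFinder = new RegExp(urlRegex.source.replace(/^\^|\$$/g, ""), "g");
const urls = content.match(urlFinder) || [];
urls.forEach(url => validateSafety(url));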
Validate and filter URLs in web scraping programs to ensure only valid URLs are processed, improving crawler efficiency and data quality.
// Crawler URL filtering
const validUrls = urlList.filter(url =>
  urlRegex.test(url)
);
Validate URL parameter formats in API interfaces to ensure received URL parameters meet expected formats, improving interface robustness.
// API parameter validation
if (!urlRegex.test(req.body.callback_url)) {
  return res.status(400).json({error: "Invalid URL"});
}
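URL validation may fail because the protocol prefix (http:// or https://) is missing, the URL contains characters outside the allowed set (for example a space), or the host does not end in a dot followed by a 1-6 character top-level domain. The pattern does not check port numbers separately: a colon and digits are simply accepted as ordinary characters.
Correct format: https://www.example.com, http://example.com:3000/path
Incorrect format: www.example.com, ftp://example, http://, http://localhost:3000/path (the host has no dot and top-level domain)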
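When a URL is rejected, it can help to report which of these causes applies. The following is a rough sketch, not a definitive diagnostic: `explainFailure` is a hypothetical helper written for this example, and its reasons are heuristics derived from the pattern's character classes.

```javascript
const urlRegex = /^https?:\/\/(www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_\+.~#?&//=]*)$/;

// Hypothetical helper: returns null when the URL passes the pattern,
// otherwise a short, best-effort reason string.
function explainFailure(url) {
  if (urlRegex.test(url)) return null;

  // Cause 1: the string does not start with http:// or https://
  if (!/^https?:\/\//.test(url)) return "missing http:// or https:// prefix";

  const rest = url.replace(/^https?:\/\//, "");
  if (rest.length === 0) return "nothing after the protocol";

  // Cause 2: a character that neither character class of the pattern allows
  const bad = rest.match(/[^-a-zA-Z0-9()@:%_+.~#?&\/=]/);
  if (bad) return `character not allowed by the pattern: "${bad[0]}"`;

  // Cause 3: everything else, typically a host without a dot and short TLD
  return "host does not fit the name.tld shape the pattern expects";
}

console.log(explainFailure("https://www.example.com"));
console.log(explainFailure("www.example.com"));
console.log(explainFailure("http://localhost:3000/path"));
console.log(explainFailure("https://example.com/my file"));
```

The checks run from most to least specific, so the first matching reason is usually the most useful one to show a user.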
Regular expressions can only validate URL format, not confirm if the URL actually exists. To verify accessibility, you need to send HTTP requests to check response status.
// Check URL accessibility
fetch(url).then(response => {
  console.log('Status:', response.status);
}).catch(err => console.error('URL not accessible'));
Basic URL regex patterns usually don't support internationalized domain names. For Chinese domains and similar, it's recommended to use specialized URL parsing libraries or more complex regex patterns.
Example: http://中文域名.com requires special handling for proper validation.
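One possible form of that special handling is to let the standard `URL` class convert the host to its ASCII (Punycode) form before applying the pattern. This is a minimal sketch, assuming a runtime whose `URL` implementation performs that conversion (current browsers and Node.js do); the variable names are illustrative.

```javascript
const urlRegex = /^https?:\/\/(www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_\+.~#?&//=]*)$/;

const idnUrl = "http://中文域名.com";

// The raw form fails: the Chinese characters are outside the pattern's character classes.
console.log(urlRegex.test(idnUrl)); // false

// The URL parser rewrites the host into an ASCII "xn--" label.
const asciiUrl = new URL(idnUrl).href;
console.log(asciiUrl);

// The ASCII form contains only letters, digits, hyphens, and dots in the host,
// so the pattern can be applied to it.
console.log(urlRegex.test(asciiUrl));
```

Note that `new URL(...)` throws on input it cannot parse, so real code would wrap the call in `try`/`catch`.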
Special characters in URLs need to be encoded. Spaces should be encoded as %20, and Chinese characters need UTF-8 encoding. URLs can be normalized before validation.
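As a sketch of that normalization step, the built-in `encodeURI` function percent-encodes spaces and non-ASCII characters as UTF-8 while leaving URL delimiters such as `:`, `/`, `?`, and `=` intact. The sample URLs below are made up for illustration.

```javascript
const urlRegex = /^https?:\/\/(www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_\+.~#?&//=]*)$/;

// Hypothetical user input containing a space in the path.
const rawUrl = "https://example.com/my file.html";
console.log(urlRegex.test(rawUrl)); // false, a space is not an allowed character

// Normalize first, then validate.
const normalized = encodeURI(rawUrl);
console.log(normalized); // https://example.com/my%20file.html
console.log(urlRegex.test(normalized)); // true

// Non-ASCII path characters are encoded as UTF-8 percent sequences as well.
const encodedChinese = encodeURI("https://example.com/文档");
console.log(urlRegex.test(encodedChinese)); // true
```

One caveat: `encodeURI` also encodes a literal `%`, so an input that is already percent-encoded would be encoded twice. Apply it only to raw, unencoded input.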