Internet and e-mail policy and practice
including Notes on Internet E-mail



26 Aug 2011

The design of the Domain Name System (Part II) - Exact and approximate name matching

In the previous installment, we looked at the overall design of the DNS. Today we'll look at the ways it does and does not allow clients to look up data by name.

The most important limitation of the DNS, compared to other databases, is that it only does exact match lookups. That is, with a few minor exceptions, the name in the query has to match the name of the desired records exactly. One exception is the folding of upper and lower case characters, which has little effect; the other is DNS wildcards.

Wildcards have always been part of the DNS, but the details of their definition have been confusing. The definition was clarified by RFC 4592 in 2006.

Wildcards provide a very constrained form of pattern matching, telling the server to synthesize records for all nodes below a specified node that don't have explicit records. That is, if there is a DNS record named *.foo.example, it will match requests for something.foo.example, so long as there aren't any records with that explicit name. A single star as the leftmost component of a domain name is the only form of wildcard; stars anywhere else in a name are just normal characters.
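The matching rule can be sketched in a few lines of Python. This is a simplified model, not a real server: the zone is a plain dictionary, record types are ignored, and the function name resolve and the sample data are made up for illustration. It follows the closest-encloser idea from RFC 4592, in which a wildcard applies only when the longest existing ancestor of the query name has a *. child.

```python
def ancestors(name):
    """Yield name and each of its parent domains, longest first."""
    labels = name.lower().rstrip(".").split(".")
    for i in range(len(labels)):
        yield ".".join(labels[i:])


def resolve(zone, qname):
    """Return the records for qname, or None if the name does not exist.

    zone maps owner names to lists of record strings (hypothetical layout).
    An empty list means the name exists but has no records of its own.
    """
    zone = {owner.lower().rstrip("."): recs for owner, recs in zone.items()}
    qname = qname.lower().rstrip(".")  # case folding

    # A name exists if it has records or if some name below it does.
    existing = {a for owner in zone for a in ancestors(owner)}
    if qname in existing:
        return zone.get(qname, [])  # an explicit name always wins

    # Closest encloser: the longest ancestor of qname that exists.
    for parent in list(ancestors(qname))[1:]:
        if parent in existing:
            # Only a star directly below the closest encloser can match.
            return zone.get("*." + parent)
    return None


zone = {
    "*.foo.example": ["A 192.0.2.1"],
    "www.foo.example": ["A 192.0.2.2"],
    "_attribute.*.example.com": ["TXT not-a-wildcard"],
}
print(resolve(zone, "something.foo.example"))       # wildcard answer
print(resolve(zone, "www.foo.example"))             # explicit answer
print(resolve(zone, "_attribute.www.example.com"))  # None: star is not leftmost
```

In this model the star in _attribute.*.example.com is an ordinary label, so a query for _attribute.www.example.com finds nothing.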

In practice, this turns out to be of limited use, with typical applications being web servers that catch any variation of their name, e.g. http://anything.example.com, and mail systems that give each user a separate domain, e.g., mailbox@anyuser.example.com. Wildcards do not work with prefixed names, such as _attribute.*.example.com, nor are they useful to handle ranges of queries except in some very stylized cases.

Some applications have proposed sequences of multiple queries to simulate range queries. For example, DNS blacklists (DNSBLs) map IP addresses into DNS names using a modified version of the mapping used for reverse DNS. If a DNSBL is called dnsbl.example, the entry for the IP address 12.34.56.78 would be 78.56.34.12.dnsbl.example.
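As a small illustration of that mapping, the following sketch builds the query name for an IPv4 address. The function name dnsbl_name is invented here, and dnsbl.example is the placeholder zone from the text.

```python
import ipaddress


def dnsbl_name(ip, zone="dnsbl.example"):
    """Reverse the four octets of an IPv4 address and append the DNSBL zone."""
    octets = ipaddress.IPv4Address(ip).packed  # four bytes, network order
    return ".".join(str(o) for o in reversed(octets)) + "." + zone


print(dnsbl_name("12.34.56.78"))
```

A client would then make an ordinary query for that exact name.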

When a DNSBL wants to list a range of IP addresses, it conceptually needs to include a record for each name corresponding to an IP address in the range. For DNS servers that use traditional master files, since each component in the name represents eight bits of the IP address, this involves breaking an IPv4 range into a minimal covering set of blocks on eight-bit boundaries, adding a wildcard for each block, and adding an individual entry for each address not in a larger block.
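One way to do that decomposition is a greedy walk through the range, taking the largest octet-aligned block that fits at each step. The sketch below is only an illustration under that assumption; the function name range_to_dnsbl_names is hypothetical, and it produces owner names only, not complete master file records.

```python
import ipaddress


def range_to_dnsbl_names(first, last, zone="dnsbl.example"):
    """Cover the inclusive IPv4 range first..last with DNSBL owner names.

    Blocks of 2**24, 2**16, or 2**8 addresses become wildcard names;
    leftover single addresses become ordinary names.
    """
    lo = int(ipaddress.IPv4Address(first))
    hi = int(ipaddress.IPv4Address(last))
    names = []
    while lo <= hi:
        # Largest block that starts on its own boundary and fits in the range.
        # host_bits == 0 (one address) always fits, so the loop always breaks.
        for host_bits in (24, 16, 8, 0):
            size = 1 << host_bits
            if lo % size == 0 and lo + size - 1 <= hi:
                break
        octets = ipaddress.IPv4Address(lo).packed
        kept = octets[: 4 - host_bits // 8]  # the octets the block fixes
        labels = [str(o) for o in reversed(kept)]
        if host_bits:
            labels.insert(0, "*")  # wildcard stands in for the free octets
        names.append(".".join(labels + [zone]))
        lo += size
    return names


for name in range_to_dnsbl_names("192.0.2.254", "192.0.3.255"):
    print(name)
```

For that sample range the result is two single-address names for 192.0.2.254 and 192.0.2.255, followed by one wildcard covering all of 192.0.3.x.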

Some people have suggested approaches to try to optimize range listings by querying for prefixes of a desired address, e.g., if the address is 1.2.3.4 and the name is 4.3.2.1.dnsbl.example, query for 2.1.dnsbl.example to see if any of the 1.2.xx.xx range of addresses are listed. According to DNS rules, the query should return NXDOMAIN if there are no entries in the range, or NODATA if there are some. While this technique might work, it is quite fragile, due to DNS servers that don't correctly distinguish between NXDOMAIN and NODATA responses, currently including the most popular DNSBL server, rbldnsd. Also, at this point there is no evidence that the probes really would save queries or cache entries compared to just querying for each address as needed. In principle, a DNS cache could synthesize its own NXDOMAIN responses for names below an existing NXDOMAIN (anything in *.2.1.dnsbl.example here), but again, it's fragile, and as far as I know, no widely used DNS cache does that other than as an experiment.
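To make the NXDOMAIN versus NODATA distinction concrete, here is a toy model of what a correctly behaving server would answer for a prefix probe. It makes no network queries and ignores wildcards; the set of listed names and the function names classify and probe_name are assumptions made for the example.

```python
def classify(listed, qname):
    """Return the answer a correct server would give for qname.

    listed is a set of owner names that have records. "NODATA" here means
    a successful response with no answer records: the name exists only
    because something below it does.
    """
    qname = qname.lower().rstrip(".")
    listed = {n.lower().rstrip(".") for n in listed}
    if qname in listed:
        return "NOERROR"
    if any(name.endswith("." + qname) for name in listed):
        return "NODATA"
    return "NXDOMAIN"


def probe_name(ip, octets_kept, zone="dnsbl.example"):
    """Build the probe name for the first octets_kept octets of ip."""
    octets = ip.split(".")[:octets_kept]
    return ".".join(reversed(octets)) + "." + zone


listed = {"4.3.2.1.dnsbl.example"}
print(classify(listed, probe_name("1.2.3.4", 2)))  # something in 1.2.x.x is listed
print(classify(listed, probe_name("9.9.9.9", 2)))  # nothing in 9.9.x.x is listed
```

The probe is only useful if the server really returns different answers in the two cases, which is exactly the behavior the text describes as unreliable in practice.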

As a general rule, a successful DNS application makes one query, or at most a small bounded number of queries, for each application call.

Note that the issue of DNS range queries is separate from that of application ranges. Most notably, the NAPTR RRTYPE, defined in RFC 3403 and used to find servers for things like telephone numbers, includes a string which applications interpret as a regular expression to be matched against a source string to find a domain for a subsequent lookup. While one can debate the wisdom of the rather complicated application design of which NAPTR is a part, it does not involve any pattern matching in the DNS. The NAPTR lookup algorithm makes a small set of specific DNS queries, which the DNS handles without difficulty. It does involve potential provisioning problems, since regular expressions include a lot of special characters and escape sequences, something that few other RRTYPEs include and whose handling by provisioning software may not be well debugged.
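The following sketch shows where that pattern matching actually happens: in the application, after the NAPTR record has been fetched by an ordinary exact-match query. It is a rough approximation only. It uses Python's re module rather than the POSIX extended regular expressions the NAPTR specifications call for, it does not handle a delimiter escaped inside the expression, and the function name apply_naptr_regexp and the sample strings are illustrative.

```python
import re


def apply_naptr_regexp(field, source):
    """Apply a NAPTR regexp field of the form <d>pattern<d>replacement<d>flags.

    Returns the rewritten string, or None if the pattern does not match.
    """
    delim = field[0]  # the first character is the delimiter
    parts = field[1:].split(delim)
    if len(parts) != 3:
        raise ValueError("expected delimiter, pattern, replacement, flags")
    pattern, replacement, flags = parts
    re_flags = re.IGNORECASE if "i" in flags else 0
    match = re.search(pattern, source, re_flags)
    if match is None:
        return None
    # Substitute \1 ... \9 in the replacement with the matched groups.
    return match.expand(replacement)


field = r"!^urn:cid:.+@([^\.]+\.)(.*)$!\2!i"
print(apply_naptr_regexp(field, "urn:cid:199606121851.1@bar.example.com"))
```

With this sample field the second parenthesized group, example.com, becomes the domain for the next lookup. The backslashes and delimiters in the field are the kind of characters that provisioning software has to carry through unchanged.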

In the next installment, we'll look at the way delegation of parts of the DNS works, and how that affects the way applications use it.


posted at: 14:02
Stable link is https://jl.ly/Internet/dnsdesign2.html


© 2005-2020 John R. Levine.