IRIs, URIs and URLs
2011-05-04 16:00
Reposted from: http://jero.net/articles/iris-uris-urls
You've probably heard about them before (or at least the last two):
Internationalized Resource Identifier (IRI)
Uniform Resource Identifier (URI)
Uniform Resource Locator (URL)
Chances are you roughly know what the three are about, but exactly how they differ is something most people don't know. Fear not! This article will give you the answers.
URL is probably the best-known term of the three. You most likely already know what a URL is, but what most people don't know is that every URL is also a URI. It doesn't work the other way around though, so let's see what a URI really is. But before we do that, let's take a look at what the God of the internet has to say about URIs:
A URI can be classified as a locator, a name, or both. A Uniform Resource Locator (URL) is a URI that, in addition to identifying a resource, provides a means of acting upon or obtaining a representation of the resource by describing its primary access mechanism or network "location". For example, the URL http://www.wikipedia.org/ is a URI that identifies a resource (Wikipedia's home page) and implies that a representation of that resource (such as the home page's current HTML code, as encoded characters) is obtainable via HTTP from a network host named www.wikipedia.org.
So let's summarize that: a URI identifies a resource, and it does not have to give access to that resource the way the address of a webpage does. When a URI also tells you where the resource is and how to get it, we call it a URL, or a locator for short. However, a URI can also be a name instead of a locator. When a URI is a name we can call it a URN (Uniform Resource Name), for example urn:isbn:0451450523. If you want more information on URNs, Wikipedia is the place to be.
So a URI can be one of two kinds (or occasionally both):
URL (locator);
and URN (name).
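To make the locator/name split a bit more concrete, here is a minimal Python sketch. The helper classify_uri is hypothetical and only a rough heuristic: it assumes that anything in the urn scheme is a name and that anything with a network host is a locator, which is a simplification of what the standard says. The ISBN URN is just an illustrative value.

```python
from urllib.parse import urlsplit


def classify_uri(uri: str) -> str:
    """Hypothetical helper: rough guess at whether a URI is a locator or a name."""
    parts = urlsplit(uri)
    if parts.scheme == "urn":
        # The "urn" scheme is reserved for names, e.g. urn:isbn:...
        return "name (URN)"
    if parts.netloc:
        # A network host suggests the URI says where to fetch the resource.
        return "locator (URL)"
    return "unclear"


examples = ["http://www.wikipedia.org/", "urn:isbn:0451450523"]
labels = {uri: classify_uri(uri) for uri in examples}
for uri, label in labels.items():
    print(f"{uri} -> {label}")
```

Both strings are URIs; only the first one tells you which host to contact and which protocol to use.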
But when you look at the real world you'll see that when people provide a URI, that URI is almost always a URL, not a URN. However, the recommendation (see RFC 3986) is to use the general term URI instead of URL, so that's what I'll do from now on.
Now that we've got that done, let's move on to IRIs. IRIs are newer (they were standardized in RFC 3987 in 2005), and revolutionary! If you go back to the list at the top, you will see that the first letter of IRI stands for "Internationalized". What it means is that an IRI is a sequence of Unicode characters, rather than being limited to a subset of ASCII. Now that is interesting, especially in this time, where the internet and computers are spread around the entire world with a lot of different languages. And as you know, not every language uses the same alphabet as the English language does. French, for example, uses a lot of characters like ê and é. But those characters cannot be used directly in URIs, the standard we use now; they have to be percent-encoded.
That's why they came up with the IRI. It allows you to use characters from almost any script directly, without percent-encoding them (percent-encoding writes a byte as % followed by two hex digits, e.g. %20 for a space), because IRIs are defined over Unicode. As you can imagine, that really adds a lot of value for languages like Japanese, because a Japanese website can use Japanese characters in its IRIs, which increases accessibility if the main language of the document is indeed Japanese. That is essentially the only difference from the URI standard, but as you see, a very important one. Unfortunately, many current applications are incapable of handling IRIs, so we need to wait for them to fully comply with the new standard. In other words: don't count on using IRIs everywhere within the next couple of years.
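As a small illustration of why those French characters are a problem for URIs, the Python sketch below checks whether they are ASCII and shows the percent-encoded form that a URI needs instead. It assumes UTF-8 as the encoding (the default of urllib.parse.quote); the word "café" is only an example value.

```python
from urllib.parse import quote, unquote

# Neither character is ASCII, so neither may appear literally in a URI.
for char in ["ê", "é"]:
    print(char, char.isascii(), quote(char))

# Percent-encoding replaces each UTF-8 byte with %XX.
encoded = quote("café")
decoded = unquote(encoded)
print(encoded)  # caf%C3%A9
print(decoded)  # café
```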
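Until applications handle IRIs natively, an IRI usually has to be mapped to a plain URI before it goes over the wire. The following is a minimal Python sketch of that mapping, not a complete implementation of the standard: the function iri_to_uri is hypothetical, it assumes the host name is already ASCII (internationalized domain names need separate handling), and example.com is a placeholder host.

```python
from urllib.parse import quote, urlsplit, urlunsplit

# Characters that are legal in these URI components and must stay as they are.
# "%" is kept so that existing percent-escapes are not encoded a second time.
SAFE_PATH = "/%:@!$&'()*+,;="
SAFE_QUERY = SAFE_PATH + "?"


def iri_to_uri(iri: str) -> str:
    """Hypothetical helper: percent-encode the non-ASCII parts of an IRI.

    Assumes the host is ASCII; only path, query and fragment are converted.
    """
    parts = urlsplit(iri)
    path = quote(parts.path, safe=SAFE_PATH)
    query = quote(parts.query, safe=SAFE_QUERY)
    fragment = quote(parts.fragment, safe=SAFE_QUERY)
    return urlunsplit((parts.scheme, parts.netloc, path, query, fragment))


japanese_iri = "http://example.com/日本語"
print(iri_to_uri(japanese_iri))
# http://example.com/%E6%97%A5%E6%9C%AC%E8%AA%9E
```

The readable form is what a person would type or see; the percent-encoded form is what a URI-only application can work with. A URI that is already pure ASCII passes through unchanged.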