What is the difference between the getHost and getAuthority methods in the URL class in Java?

2020-03-16 • Java

I have a set of strings (URLs) that come in several different forms:

> http://domain-name.anything/anypath
> https://domain-name.anything/anypath
> http://www.domain-name.anything/anypath
> https://www.domain-name.anything/anypath

These strings are saved in a CSV file. I need to parse every URL to get only the domain name, that is, the part after the `//` and before the first `/`.

I use the split method to separate the strings, convert each one to a URL, and then call getAuthority to get only the domain name. The problem is that, for me, getAuthority and getHost do the same thing: both include the www that I don't want. Yet the Oracle tutorial seems to suggest that getAuthority should return the domain name without www.

How can I extract the domain name without www from a URL?

Solution

To really understand this, you should read the URI specification, RFC 2396.

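The short answer is that the authority component consists of the host component plus an optional port number, user name and password, depending on the URL scheme used. When a URL has none of those optional parts, getAuthority() and getHost() return the same string, and neither of them strips "www.".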
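The difference only shows up when the URL carries a port or user information. The following is a minimal sketch, not a definitive implementation; the class name `AuthorityVsHost`, the helper `describe`, and the `example.com` URLs are assumptions made for illustration and do not come from the question.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

public class AuthorityVsHost {

    // Hypothetical helper: returns "host | authority" for a URL string.
    static String describe(String s) {
        try {
            URL url = URI.create(s).toURL();
            return url.getHost() + " | " + url.getAuthority();
        } catch (MalformedURLException e) {
            // Rethrow unchecked so callers do not need a throws clause.
            throw new IllegalArgumentException("Not a valid URL: " + s, e);
        }
    }

    public static void main(String[] args) {
        // No port or user info: host and authority are identical.
        System.out.println(describe("http://www.example.com/anypath"));
        // prints: www.example.com | www.example.com

        // With user info and a port, only the authority includes them.
        System.out.println(describe("http://user:secret@www.example.com:8080/anypath"));
        // prints: www.example.com | user:secret@www.example.com:8080
    }
}
```

In both cases the "www." prefix is still part of the host, which matches what the question observed.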

You can call getHost() and test whether the result starts with the string "www."; if it does, remove that prefix.
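That approach could be sketched as follows. This is an illustration under stated assumptions rather than the answerer's own code: the class `HostWithoutWww`, the method `hostWithoutWww`, and the `example.com` inputs are hypothetical names, and the prefix check is made case-insensitive as a design choice.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

public class HostWithoutWww {

    private static final String PREFIX = "www.";

    // Hypothetical helper: returns the host of the URL with a leading "www." removed.
    static String hostWithoutWww(String s) {
        try {
            URL url = URI.create(s.trim()).toURL();
            String host = url.getHost();
            // Compare ignoring case, since host names are case-insensitive.
            boolean hasPrefix = host.regionMatches(true, 0, PREFIX, 0, PREFIX.length());
            return hasPrefix ? host.substring(PREFIX.length()) : host;
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Not a valid URL: " + s, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(hostWithoutWww("https://www.example.com/anypath")); // prints: example.com
        System.out.println(hostWithoutWww("http://example.com/anypath"));      // prints: example.com
    }
}
```

The result is only a string for grouping or display; it is not guaranteed to be a host name that still resolves.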

But before you start doing this, you need to understand that deleting "www." may give you an invalid URL, or one that resolves to a different document or service than the original URL did. Rewriting URLs arbitrarily is a bad idea unless you know more about how the websites in question are organized.

Treating "foo.com" and "www.foo.com" as the same site is just a convention, and many websites do not implement it. Deleting "www." can therefore turn a resolvable URL into an unresolvable one.
