How do I analyze text that doesn't have a separator (e.g., a domain name)?

Tags: csrss
I have a bunch of domain names (without the TLD) that I'd like to search, but they don't always have a natural break between words (like a "-"). For instance:
techtarget
americanexpress
theamericanexpress // a non-existent site
thefacebook
What is the best analyzer to use? E.g., if a user types in "american ex", I'd like to prioritize "americanexpress" over "theamericanexpress". A simple prefix query would work in this particular case, but if a user then types in "facebook", the prefix query returns nothing, even though "thefacebook" should match. ;(
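
To pin down the behavior described here, this is a minimal Python sketch of one possible rule, not a recommendation for any particular search engine: strip whitespace from the query, match it as a substring anywhere in the name, and rank earlier match positions first (so a prefix match wins), with shorter names breaking ties. The function name `rank_domains` and the tie-breaking rule are assumptions made for illustration.

```python
def rank_domains(query, domains):
    """Return the domains that contain the query, best match first."""
    # Domain names have no separators, so drop whitespace from the query too.
    needle = "".join(query.lower().split())
    if not needle:
        return []
    scored = []
    for name in domains:
        pos = name.lower().find(needle)
        if pos == -1:
            continue  # query does not occur anywhere in this name
        # Earlier match positions rank higher; shorter names break ties.
        scored.append((pos, len(name), name))
    return [name for _, _, name in sorted(scored)]


domains = ["techtarget", "americanexpress", "theamericanexpress", "thefacebook"]
print(rank_domains("american ex", domains))  # ['americanexpress', 'theamericanexpress']
print(rank_domains("facebook", domains))     # ['thefacebook']
```

A linear scan like this only illustrates the ranking rule. In a search engine, substring matching of this kind is commonly approximated by indexing n-grams of each name, if the engine in use offers an n-gram tokenizer.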


Software/Hardware used: server

Discuss This Question: 2 Replies

 
  • carlosdl
    You'll need to tell us what language or software you are trying to do this in. Regardless of that, the first step is to define clear rules for getting the results you are after.

    For example: how exactly do you know that "americanexpress" must be prioritized over "theamericanexpress"? What do you mean by "a non-existent site"? Non-existent where?
  • ToddN2000
    A few more details would help. Where is the data coming from? What languages are you familiar with? What do you need to validate the data? What is the intended use of the data you need to verify? It almost sounds like you are trying to design your own search engine results based on some kind of input file.
