Search Engines

1.2. Search Engines

During the business-related information gathering phase, a wide range of research is conducted, covering the following areas:

                      -------------> Web Presence -------------
                    /                                           \
                   /                                             V  
        Cached and Archival Sites                    Partners and Third Parties
                  ^                Public Information            |
                  |                                              V  
              Harvesting                                  Job Posting
                  ^                                             /
                   \                                           /
                     ------------Financial Information <------

1.2.1. Web Presence

In this phase, you will learn a great deal more about your target including:

  • What they do

  • What their business purpose is

  • Physical and logical locations

  • Employees and departments

  • Email and contact information

  • Alternative web sites and sub-domains

  • Press releases, news, comments, opinions

Sources that you can get the data from:

You have probably noticed by now that this process is not set in stone and is never the same for every organization. Organizations in different industries can be investigated by searching different publicly available databases. Compliance and regulatory requirements may force companies to publish different kinds of information.

One example is publicly traded companies, which have to file their financial documents with the SEC. To search these filings, you can use EDGAR (the Electronic Data Gathering, Analysis, and Retrieval system).
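As a rough illustration, the sketch below builds a link to EDGAR's company search that you can open in a browser. The endpoint and its query parameters (action, company, type, count) are written from memory of the classic EDGAR search page and should be treated as assumptions; "Example Corp" is a placeholder name. Check the SEC's own documentation before automating anything against it.

```python
from urllib.parse import urlencode

# Assumed endpoint of the classic EDGAR company search page.
EDGAR_BROWSE = "https://www.sec.gov/cgi-bin/browse-edgar"


def edgar_company_url(company, form_type="10-K", count=40):
    """Build a browser link listing a company's filings of one form type."""
    params = {
        "action": "getcompany",  # assumed parameter names, see lead-in
        "company": company,
        "type": form_type,
        "count": count,
    }
    return f"{EDGAR_BROWSE}?{urlencode(params)}"


# "Example Corp" is a hypothetical target name.
print(edgar_company_url("Example Corp"))
```

The function only builds a URL and performs no network request, so it is safe to run offline.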

1.2.2. Partners and Third Parties

Other information you can gather about the company includes mergers, acquisitions, partnerships, third parties, etc.

With these you can deduce what type of technologies and systems they use internally.

1.2.3. Job Posting

From job postings we can deduce internal hierarchies, vacancies, projects, responsibilities, weak departments, financed projects, technology implementations and more.

Job posting websites:

  • LinkedIn

  • Indeed

  • Monster

  • Careerbuilder

  • Glassdoor

  • Simplyhired

  • Dice
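To illustrate how technology implementations can be deduced from job postings, here is a minimal sketch that counts how many saved postings mention each technology. The keyword list and the sample postings are hypothetical; in practice you would paste in text collected from the sites above and tune the keywords to the target's industry.

```python
import re
from collections import Counter

# Hypothetical keyword list; adjust it to the target.
TECH_KEYWORDS = ["Cisco", "Oracle", "Active Directory", "VMware", "Apache", "SAP"]


def technologies_mentioned(postings, keywords=TECH_KEYWORDS):
    """Count how many postings mention each keyword (whole words, any case)."""
    counts = Counter()
    for text in postings:
        for keyword in keywords:
            pattern = r"\b" + re.escape(keyword) + r"\b"
            if re.search(pattern, text, re.IGNORECASE):
                counts[keyword] += 1
    return counts


# Hypothetical posting text, not taken from a real company.
sample_postings = [
    "Seeking a sysadmin with Active Directory and VMware experience.",
    "DBA needed: Oracle 12c required, VMware a plus.",
]

for tech, hits in technologies_mentioned(sample_postings).most_common():
    print(f"{tech}: mentioned in {hits} posting(s)")
```

A technology that shows up across several postings is a stronger hint about the internal environment than a single mention.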

1.2.4. Financial Information

With a company's financial information, you can easily find out if the organization:

  • is going to invest in a specific technology

  • might be subject to a possible merger with another organization

  • has critical assets and business services

Tools:

  • Crunchbase: you can find information about:

    • Companies

    • People

    • Investors and financial information

    Note that anyone can edit the information in it.

  • Inc: Inc. focuses on growing companies and provides them with advice, resources, and information. It publishes a list of the 500/5000 fastest-growing private companies, with very useful information and statistics about them.

1.2.5. Harvesting

In this phase, we cover methods for gathering company documents such as charts (detailing the company structure), database files, diagrams, papers, documentation, spreadsheets, and so on. This is also the right time to begin harvesting email addresses, accounts (Twitter, Facebook, etc.), names, roles, and more.

It is important to know that when a document is created, it automatically stores information (metadata) like who created it, date and time of creation, software used, computer name, and so on.

If we are able to retrieve documents online and inspect the underlying metadata, we can extract useful information.
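As an example of what inspecting metadata can look like, the sketch below reads the author fields from an Office Open XML file (.docx, .xlsx, .pptx). These files are ZIP archives that normally carry their core properties in docProps/core.xml. The function name and the choice of fields are illustrative assumptions; other formats such as PDF store metadata differently and need other tooling.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the core properties part of OOXML files.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}


def read_ooxml_metadata(file_or_path):
    """Return selected core properties from a .docx/.xlsx/.pptx file.

    Fields that are missing from the document come back as None.
    """
    with zipfile.ZipFile(file_or_path) as archive:
        root = ET.fromstring(archive.read("docProps/core.xml"))
    fields = {
        "creator": "dc:creator",
        "last_modified_by": "cp:lastModifiedBy",
        "created": "dcterms:created",
    }
    return {
        name: root.findtext(path, default=None, namespaces=NS)
        for name, path in fields.items()
    }
```

Calling read_ooxml_metadata("report.docx") on a downloaded file (the file name here is hypothetical) returns a dictionary whose creator and last_modified_by values are often usernames worth recording.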

1. Google Dorks

We can use the following Google search filters:

site:[website] filetype:[filetype]

This narrows the results down to links to files with the [filetype] extension that are hosted on [website].
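When several file types are of interest, the queries can be generated rather than typed by hand. The following is a minimal sketch; example.com and the list of extensions are placeholders, and the helper names are made up for illustration.

```python
from urllib.parse import quote_plus


def build_dorks(website, filetypes):
    """Return one 'site: filetype:' query per file extension."""
    return [f"site:{website} filetype:{ft}" for ft in filetypes]


def google_search_url(query):
    """Turn a query string into a Google search link to open in a browser."""
    return "https://www.google.com/search?q=" + quote_plus(query)


# Placeholder target and extensions.
for dork in build_dorks("example.com", ["pdf", "doc", "xls"]):
    print(dork)
    print("  " + google_search_url(dork))
```

The links are meant to be opened manually; search engines tend to block scripted queries.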

2. FOCA

Doing this manually can be very tedious and time-consuming. A very useful tool that lets us find and download files automatically is FOCA.

By querying search engines like Google and Bing, FOCA retrieves files and then attempts to extract metadata such as names, usernames, passwords, OS, etc.

Note that, unfortunately, this tool works only on Windows.

FOCA allows us to download and extract infrastructure information as well as business information, but for now we will focus only on the business information.

3. theHarvester

By querying search engines and social networks, theHarvester can enumerate email accounts, usernames, domains, and hostnames.

Once we have the tool installed on our machine, we can run the following command to retrieve information about elearnsecurity.com:

theharvester -d elearnsecurity.com -l 100 -b google

where:

  • -d is the domain or the company to search

  • -l limits the results to the value specified

  • -b is the data source to query (google, linkedin, bing, etc.)
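To repeat the same search against several data sources, the command line can be assembled in a small script. This sketch only builds and prints the commands using the three options described above; it assumes the executable is named theharvester as in the example (some installations use theHarvester), and whether a given source is available depends on the installed version.

```python
import shlex


def harvester_command(domain, limit=100, source="google"):
    """Build the argument list for one theHarvester run."""
    return ["theharvester", "-d", domain, "-l", str(limit), "-b", source]


# Print one command per data source instead of running them.
for source in ("google", "bing", "linkedin"):
    print(shlex.join(harvester_command("elearnsecurity.com", source=source)))
```

To actually run a command, the list can be passed to subprocess.run(cmd, capture_output=True, text=True), provided the tool is installed and on the PATH.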

1.2.6. Cached and Archival Sites

Since information on the web changes so quickly, seeking out an older version of a site can sometimes prove useful to our cause.

Consider a job post. If the organization deletes it from the website, you "lose" that information; if you could see the webpage as it was before the update, you could still harvest it. It turns out this is entirely possible through cache and archival services.

Tools:

  • archive.org

  • Google dork (cache:URL)
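For archive.org, lookups can also be scripted. The sketch below targets the Wayback Machine availability endpoint, which, as far as I know, returns JSON describing the snapshot closest to a requested date. The endpoint URL and response layout are stated from memory and should be verified against the Internet Archive's documentation; the sample response is hand-written for illustration and is not real output.

```python
import json
from urllib.parse import urlencode


def wayback_query_url(url, timestamp=None):
    """Build an availability query; timestamp is YYYYMMDD (optional)."""
    params = {"url": url}
    if timestamp:
        params["timestamp"] = timestamp
    return "https://archive.org/wayback/available?" + urlencode(params)


def closest_snapshot(response_text):
    """Return the archived URL from a JSON response, or None if absent."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None


# Hand-written sample response (assumed layout, not real output).
sample = json.dumps({
    "archived_snapshots": {
        "closest": {
            "available": True,
            "url": "http://web.archive.org/web/20200101000000/http://example.com/careers",
            "timestamp": "20200101000000",
            "status": "200",
        }
    }
})

print(wayback_query_url("example.com/careers", "20200101"))
print(closest_snapshot(sample))
```

Fetching the query URL (for example with urllib.request.urlopen) and passing the body to closest_snapshot gives a link to the page as it looked near the requested date, if a snapshot exists.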

Remember Logging!!
