Explore our comprehensive guide to advanced search techniques and learn how Google dorks can enhance your OSINT research.
Google Dorks or Google Hacking is a technique used by the media, investigative agencies, and security engineers to query various search engines to discover hidden information and vulnerabilities that can be found on public servers.
How does Google Dorking work?
Google Dorking is an OSINT technique. It is not a Google vulnerability or a method to hack a site. It works like a regular data mining process with advanced features.
A search engine’s crawler is designed to index everything it can reach, even material that the administrators of a web resource never intended to publish.
However, the most interesting thing about Google Dorking is how much it teaches you about the way Google search works. Later, you can use this knowledge to search for distant relatives or to find other information you need.
What information can be found through Dorks?
You can find a lot, from remote access controllers to configuration interfaces for critical systems. Much of it stays online because its owners assume that no one will ever find it among the huge amount of information already posted on the network.
How does it happen that such important information gets into the public domain? Let’s take an example. You bought a security camera, installed an application for it, and allowed access to a server that will allow you to watch the image from it anywhere in the world. It sounds very convenient, but on the other hand, you do not know if this server has any protection. This server may not require a password to access the feed from your webcam. It makes this feed publicly available to anyone searching for the text on the camera view page.
Google is ruthlessly efficient at finding any device on the internet that runs an HTTP or HTTPS server. Because the vast majority of these devices expose a web interface for configuration, a lot of things that shouldn’t have been on Google end up there.
It also happens that, because of poor information security practices, files containing confidential information, such as logins, passwords, and database names, end up in the public domain.
In this article, we will show how to use Google Dorks for OSINT, not only to find such files but also to see how exposed the platforms that hold lists of addresses or emails might be.
Search Operators
Dorking is not limited to Google. Many of these operators also work on Bing, Yahoo, and DuckDuckGo, although each engine supports a slightly different set. An operator is a keyword that has a special meaning for a search engine. Commonly used operators include “inurl,” “intext,” “site,” “feed,” and “language.” Each operator is followed by a colon and then the search term, with no space in between.
They allow you to search for more specific information, such as certain lines of text within the pages of a website or files hosted at a specific URL. Among other things, Google Dorking can also find hidden login pages, error messages that give information about available vulnerabilities, and shared files.
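The operator-colon-term pattern described above can be sketched in a few lines of Python. This is only an illustration: the helper names `dork_term` and `build_dork` are made up for this article, and `example.com` is a placeholder domain.

```python
def dork_term(operator: str, value: str) -> str:
    """Render one operator:value term, quoting values that contain spaces."""
    if " " in value:
        value = f'"{value}"'
    # No space after the colon, otherwise the engine reads it as plain text.
    return f"{operator}:{value}"


def build_dork(keywords: str = "", **operators: str) -> str:
    """Join free-text keywords and operator terms into one query string."""
    parts = [keywords] if keywords else []
    parts.extend(dork_term(op, val) for op, val in operators.items())
    return " ".join(parts)


print(build_dork("login", site="example.com", inurl="admin"))
# login site:example.com inurl:admin
print(build_dork(intext="index of", site="example.com"))
# intext:"index of" site:example.com
```

The resulting string can be pasted into any search engine that supports the operators used.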
The “cache:” operator retrieves the copy of a web page that Google has stored, which is useful when the live page has been changed or deleted. Note that Google retired its public cache in 2024, so the operator may no longer return results there.
For example:
cache:www.youtube.com
When a cached copy is available, you can view the full version of the page, a text-only version, or the page source. The cache header also shows the exact time (date, hour, minute, second) at which Google’s crawler captured the page. The page is shown as a snapshot, but you can still search within it as on a regular HTML page (key combination CTRL + F). What the “cache:” command returns depends on how often Google has crawled the web page.
Useful Google Dorks for personal investigations
Okay, how can you use Google Dorks for personal investigations? For example, you can search for an email address that is associated with a username. If you are like the vast majority of people who do not think much about security on the Internet, you most likely use the same login for many services. Sometimes a username itself contains useful information. For example, the name Anna2000 hints at a year of birth. There are many other examples, but what matters most to the researcher is whether this name has been used anywhere else on the Internet.
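As a rough illustration of the Anna2000 hint, the sketch below pulls four-digit numbers out of a username and keeps those that fall in a plausible birth-year range. The function name `year_hints` and the default range are assumptions chosen for this example, and a match is only a hint, never proof.

```python
import re


def year_hints(username: str, earliest: int = 1940, latest: int = 2015) -> list[int]:
    """Return standalone four-digit numbers that look like birth years."""
    # Lookarounds make sure we match exactly four digits, not part of a longer number.
    candidates = re.findall(r"(?<!\d)(\d{4})(?!\d)", username)
    return [int(c) for c in candidates if earliest <= int(c) <= latest]


print(year_hints("Anna2000"))   # [2000]
print(year_hints("user12345"))  # []
```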
Let’s look at an example of how to search for information about a person when we only have their username, using Google Dorks for OSINT. Let’s take the same username, Anna2000. To find public e-mail addresses on the Internet that use it as an identifier, enter the following query into the search engine:
Anna2000*com
The results won’t always differ much from a plain search for Anna2000, but it’s still a good way to find a person’s email address that you can later use for further research.
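Once you have copied some result text, picking out the addresses by hand gets tedious. The following sketch shows one way to filter email addresses whose local part contains the username. The regular expression is deliberately simple rather than a full address validator, and the snippet text, the addresses, and the name `emails_for_username` are invented for illustration.

```python
import re

# Simplified email pattern; good enough for scanning text, not for validation.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def emails_for_username(text: str, username: str) -> list[str]:
    """Return unique, lowercased emails whose local part contains the username."""
    found: list[str] = []
    for email in EMAIL_RE.findall(text):
        local_part = email.split("@", 1)[0]
        normalized = email.lower()
        if username.lower() in local_part.lower() and normalized not in found:
            found.append(normalized)
    return found


snippet = (
    "Contact Anna2000@example.com or sales@example.org; "
    "old address: anna2000@mail.example.net."
)
print(emails_for_username(snippet, "Anna2000"))
# ['anna2000@example.com', 'anna2000@mail.example.net']
```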
Disclosure of new contact information from online documents
We can learn a lot by knowing only the name of a person. Since this is a public article, we will not look for documents on a real person, but we will show you the principle so that you can easily repeat this search yourself. To find public documents related to our object of study, enter “John J. Doe” and specify the file type: Excel, Word, PDF, and so on. This will give us a list of documents that mention our object.
“John J. Doe” filetype:pdf OR filetype:xlsx OR filetype:docx
As you can see, we combined an exact-phrase search with three filetype: filters joined by OR to get better results. This combination can save you a lot of time and effort. The search for John J. Doe won’t return anything meaningful because the name is a placeholder. But if we search for a real person, we will get documents that are in the public domain. These can be court cases, summaries, fines, contracts, and more, which will allow us to continue our investigation.
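If you run this kind of search often, the query can be assembled programmatically. The sketch below builds the same exact-phrase plus filetype query and URL-encodes it with Python’s standard library. The function names are hypothetical, and the search URL format (`https://www.google.com/search?q=...`) is assumed to be the ordinary one a browser uses.

```python
from urllib.parse import urlencode


def document_query(name: str, filetypes: list[str]) -> str:
    """Build an exact-phrase query limited to the given file extensions."""
    type_filter = " OR ".join(f"filetype:{ext}" for ext in filetypes)
    return f'"{name}" {type_filter}'


def search_url(query: str, base: str = "https://www.google.com/search") -> str:
    """URL-encode the query so it can be opened in a browser."""
    return f"{base}?{urlencode({'q': query})}"


query = document_query("John J. Doe", ["pdf", "xlsx", "docx"])
print(query)
# "John J. Doe" filetype:pdf OR filetype:xlsx OR filetype:docx
print(search_url(query))
```

Opening the printed URL in a browser is equivalent to typing the query by hand.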
Getting information through social networks
In 2018, Facebook published a statement explaining how it protects users’ personal data and, most importantly, what information it still collects. According to that statement, the social network collects and analyzes:
- device data: operating system, software, applications and plug-ins;
- available storage space for files, browser type, types and names of applications and files;
- information about operations: how many and how often each action is performed in a social network tab;
- identifiers (the symbolic or numeric name assigned to the user), as well as accounts from applications;
- device signals: battery level, cellular signal strength, Bluetooth connection, information about nearby Wi-Fi hotspots, radio beacons, and cell towers. Mouse movements on the user’s device and signals from the camera in the Facebook application are also analyzed;
- information about device settings: photo metadata (date and time the image was taken, geolocation, camera model and photo settings, information about its owner), names, and types of files on the device. In addition, information about free disk space and contacts from the address book is available;
- network and connection: mobile operator and provider, language, time zone, mobile phone number, connection speed. Paired devices also expose their information;
- cookies (text fragments sent to the browser from the visited site) that are stored on the user’s device, including ID and settings.
Separately, it is worth noting the WhatsApp messenger, which Facebook owns. It collects contacts, pictures, and information from group chats.
Social networks are a gold mine of information we can use in our investigation. And although our John Doe is rarely found on the net, this does not mean that we will not be able to find information about other individuals that interest us.
Example
Let’s try to find Twitter accounts associated with Harvard University, for example with a query such as “Harvard University” site:twitter.com. If we dive deeper into the SERPs, we can explore videos, tweets, and more.
To search for information on several social networks, use the OR operator. This will help discover accounts that are linked to our target across multiple platforms.
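A query that spans several networks can be generated the same way. In this sketch, the function name `across_sites` and the list of sites are chosen only for illustration.

```python
def across_sites(phrase: str, sites: list[str]) -> str:
    """Build an exact-phrase query restricted to any of the given sites."""
    if not sites:
        raise ValueError("at least one site is required")
    site_filter = " OR ".join(f"site:{site}" for site in sites)
    return f'"{phrase}" {site_filter}'


print(across_sites("Harvard University", ["twitter.com", "facebook.com", "linkedin.com"]))
# "Harvard University" site:twitter.com OR site:facebook.com OR site:linkedin.com
```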
The same kind of search also lets you look for people who have two or more accounts and publish the same messages under each of them. To do this, enter “phrase or message” site:twitter.com in the search box.
This can help expose networks of people working together or multiple accounts managed by one person.
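After collecting posts by hand, you can group them by message text to see which messages appear under more than one account. This is a minimal sketch under simple assumptions: the posts are already gathered as (account, message) pairs, the account names are invented, and messages are compared after lowercasing and collapsing whitespace only.

```python
from collections import defaultdict


def shared_messages(posts: list[tuple[str, str]]) -> dict[str, set[str]]:
    """Map each normalized message to its accounts, keeping messages seen on 2+ accounts."""
    accounts_by_message: dict[str, set[str]] = defaultdict(set)
    for account, message in posts:
        # Lowercase and collapse whitespace so trivial differences do not hide a match.
        normalized = " ".join(message.lower().split())
        accounts_by_message[normalized].add(account)
    return {msg: accts for msg, accts in accounts_by_message.items() if len(accts) >= 2}


posts = [
    ("@account_a", "Big sale today, do not miss it"),
    ("@account_b", "big sale today,  do not miss it"),
    ("@account_c", "Unrelated post"),
]
for message, accounts in shared_messages(posts).items():
    print(message, "->", sorted(accounts))
```

A shared message does not prove common ownership on its own, but it is a useful starting point for a closer look.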
The most popular Google Dorks
- cache: shows an older or deleted version of a site, e.g., cache:securitytrails.com
- allintext: looks for specific text on the page, e.g., allintext:hacking tools
- allintitle: same as the dork above, but only for page titles, e.g., allintitle:“Security Companies”
- filetype: searches for files with a given extension, e.g., email security filetype:pdf
- site: shows the indexed URLs for the given domain and its subdomains, e.g., site:securitytrails.com
- *: a wildcard that stands in for any word, e.g., how to * a website will return “how to design/create/hack… a website.”
- |: a logical OR operator, e.g., “security” | “tips” will show all the sites which contain “security,” “tips,” or both words.
- +: used to concatenate words, useful to detect pages that use more than one specific key, e.g., security + trails
- -: the minus operator excludes results that contain certain words, e.g., security -trails will show pages that use “security” in their text, but not those that have the word “trails.”
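To see how these pieces fit together, it can help to take a query apart again. The sketch below splits a dork into terms and records the operator, the value, and whether the term is excluded with a minus sign. It is a simplified parser written for this article, not how any search engine actually processes queries: it assumes lowercase operator names and straight double quotes around phrases, and it does not handle OR or wildcards.

```python
import re
from typing import NamedTuple, Optional

TOKEN_RE = re.compile(
    r'''
    (?P<exclude>-)?              # optional leading minus sign
    (?:(?P<operator>[a-z]+):)?   # optional operator name followed by a colon
    (?P<value>"[^"]*"|\S+)       # quoted phrase or a single bare word
    ''',
    re.VERBOSE,
)


class Term(NamedTuple):
    operator: Optional[str]
    value: str
    excluded: bool


def parse_dork(query: str) -> list[Term]:
    """Split a dork query into terms with their operator and exclusion flag."""
    terms = []
    for match in TOKEN_RE.finditer(query):
        value = match.group("value").strip('"')
        terms.append(Term(match.group("operator"), value, match.group("exclude") is not None))
    return terms


for term in parse_dork('allintitle:"Security Companies" -trails filetype:pdf'):
    print(term)
```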
More examples and search operators can be found in the Google Hacking Database (GHDB). If you want to automate the search process, you can use the following programs.
Best Google Dorks Open Source Projects
- Pagodo – Passive Google Dork – automates Google searching for potentially vulnerable web pages and applications on the Internet. It replaces the manual work of running Google dork searches in a web browser.
- Zeus Scanner is an advanced reconnaissance utility designed to simplify web application reconnaissance.
- Go Dork – The fastest dork scanner written in Go.
- Sitedorks – Search Google, Bing, Ecosia, Yahoo, or Yandex for a search term with several websites. A default list is already provided, containing Github, Gitlab, Surveymonkey, Trello, etc. Currently, a default list of 518 dorkable websites is available.
- DorkScanner – A typical search engine dork scanner that scrapes search engines with queries that you provide in order to find vulnerable URLs.
- Evildork – Dorks a single domain or all of its available subdomains, or a general target (likely a person), and produces an HTML results page with all the dork links. Be aware that Google, after some tries, will show a captcha to check whether you are a bot.
- Google Dorks Full List – Approximately 10,000 lines of Google dork search queries. Pull requests are welcome if you want to contribute and have your findings added.
Building a database of exposed data with Google Dorks can be illegal. Use these queries for research purposes only. Exploiting them to obtain data leaks, databases, or other sensitive information might cause you a lot of trouble and perhaps even jail.