OT: Virus infection mechanisms

- Don Y

Thu, Apr 28, 2016 3:51 PM

Sorry, couldn't think of a better forum...

I suspect antivirus tools just check for "characteristic signatures" in files and, from that, deduce that the file is "infected". The thinking being that these combinations of bytes don't appear in "normal" software.

Assuming that to be true:

- how long are these signatures? i.e., do the tools check for a fixed number of bytes in all "programs"? Or, do they effectively verify the presence of the entire viral payload?

- for DLL infections, do they hook DllMain? Or, pick some "random" (all?!) entry point in the library and "hope for the best" (worst)?

Finally, is it TYPICALLY an exercise in futility to try to remove/"comment out" the viral load (if it is always an initial stanza, then one should be able to just "skip past it"; if tightly interwoven in the code, "ain't gonna happen")?

Pointers to literature?

- Teodor V.

Thu, Apr 28, 2016 5:17 PM

I don't have pointers to literature, but I know that some antivirus software generates hashes (MD5, SHA-1 and others) from known instances of malware.

Not only from the file itself, but also from parts of the file, like the segments of an executable, and files contained in executables.
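A minimal sketch of that idea, assuming a lookup table keyed by SHA-256 digest. The table contents, the section names and the `scan_parts` helper are all made up for illustration; this is not how any particular product stores its signatures.

```python
import hashlib

# Hypothetical signature database: hex digest -> malware name.
# The single entry is the digest of a harmless placeholder byte string.
KNOWN_BAD = {
    hashlib.sha256(b"EVIL-PAYLOAD").hexdigest(): "Example.Test.A",
}

def scan_parts(parts):
    """Hash each named part and report the ones found in KNOWN_BAD."""
    hits = []
    for name, data in parts.items():
        digest = hashlib.sha256(data).hexdigest()
        if digest in KNOWN_BAD:
            hits.append((name, KNOWN_BAD[digest]))
    return hits

# A pretend executable, hashed as a whole and per "segment".
parts = {
    "whole_file": b"HEADER" + b"clean code" + b"EVIL-PAYLOAD",
    ".text": b"clean code",
    ".rsrc": b"EVIL-PAYLOAD",
}
print(scan_parts(parts))
```

Only the part that is byte-for-byte identical to the known sample matches; the whole-file hash does not, which is why hashing parts is useful at all.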

Another method used is heuristic scanning: for example, if you make an HTML file with a hyperlink that shows one URL up front as its text but actually links you to a different URL, it will cause a trap or at least a warning.
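One way such a link heuristic could look, as a rough sketch using only Python's standard library. The class name, the helper name and the `.example` hostnames are assumptions for illustration; real scanners will have far more elaborate rules.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

class LinkMismatchFinder(HTMLParser):
    """Collect links whose visible text names a different host than href."""

    def __init__(self):
        super().__init__()
        self._href = None      # href of the <a> currently open, if any
        self._text = []        # text fragments seen inside that <a>
        self.suspicious = []   # (shown_text, real_href) pairs

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            shown = "".join(self._text).strip()
            shown_host = urlparse(shown).hostname
            real_host = urlparse(self._href).hostname
            # Only flag when the text itself looks like a URL with a host.
            if shown_host and real_host and shown_host != real_host:
                self.suspicious.append((shown, self._href))
            self._href = None

def find_mismatched_links(html):
    finder = LinkMismatchFinder()
    finder.feed(html)
    finder.close()
    return finder.suspicious

page = ('<a href="http://evil.example/x">http://bank.example/login</a> '
        '<a href="http://ok.example/">http://ok.example/</a>')
print(find_mismatched_links(page))
```

Links whose text is ordinary words ("click here") are ignored, since there is no displayed host to compare against.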

A third method is to look for intentionally misspelled words. You may remember how the sildenafil spam used "v14gra" and similar permutations to get past spam filters (which are basically a niche application of the same technology that is used to scan for malware).
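A toy version of the misspelling check, assuming a small character-substitution map and a one-word blocklist. Both are invented for this sketch; a real filter would use much larger tables and fuzzier matching.

```python
# Hypothetical substitution map: look-alike character -> plain letter.
LEET = str.maketrans({"1": "i", "4": "a", "3": "e", "0": "o",
                      "5": "s", "@": "a", "$": "s"})

# Hypothetical blocklist of normalized words.
BLOCKLIST = {"viagra"}

def contains_obfuscated(text):
    """Return blocklisted words found after undoing the substitutions."""
    normalized = text.lower().translate(LEET)
    return sorted(word for word in BLOCKLIST if word in normalized)

print(contains_obfuscated("Cheap V14GRA here"))
```

The obvious weakness is false positives: any legitimate digits in the text get rewritten too, so this only works as one signal among several.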

A virus database is basically a collection of signatures (hashes such as MD5 or SHA-1, as well as scripted heuristics for known instances of malware that encrypts itself), cryptographically signed so that the software is less likely to be tricked into recognizing a legit program as a threat.

As to literature, come to think of it, I think I remember reading some of this in a document about the ClamAV scanning engine, so that would be a good starting point for your Google-Fu practice :3

/Teo.

--

teostupiditydor@algonet.se | for you are good and crunchy with 
Remove stupidity to reply  | ketchup.

- Clifford Heath

Thu, Apr 28, 2016 9:29 PM

snip

You aren't going to find the truth about what commercial virus scanners actually do. Those are their most confidential and valuable trade secrets, and they are extremely well defended.

Clifford Heath.

- Don Y

Fri, Apr 29, 2016 5:31 AM

Yes, but surely (?) not of the entire file? I.e., they'd have to store the hash of foo.exe infected with malware1, foo.exe infected with malware2, foo.exe infected with malware3... bar.dll infected with malware1, ...

And, would they then treat each such datapoint as a tuple: (filename, infection, hash)? Would altering the filename then complicate detection by breaking the tuple? Or, just a bunch of (infection, hash) tuples, expecting the hash space to be sparse, even in the face of countless possible filenames?

So, prepending "This is not an executable" to a file (making it so it wouldn't be loadable) would alter the hash of the file -- just as "Here's something else I want to try" would? (i.e., chances are, they've never encountered foo.exe thusly modified!)
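The point about prepending can be checked directly. In this sketch the "program" bytes and the signature are placeholders, and SHA-256 stands in for whatever hash a scanner might use.

```python
import hashlib

def whole_file_digest(data):
    """Digest of the entire file contents."""
    return hashlib.sha256(data).hexdigest()

# Placeholder "infected" file and the placeholder payload inside it.
payload = b"PRETEND-VIRAL-PAYLOAD"
infected = b"original program bytes" + payload

# Same file with an arbitrary banner stuck on the front.
modified = b"This is not an executable" + infected

# The whole-file digest no longer matches the known sample...
print(whole_file_digest(infected) == whole_file_digest(modified))

# ...but the payload bytes are still present, just at a new offset.
print(infected.find(payload), modified.find(payload))
```

So a whole-file hash identifies one exact file, while anything robust to such edits has to look for the payload itself at arbitrary offsets.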

So, they have to understand the structure of the file containing the potential payload -- enough to be able to isolate the "active" part.

A payload in a JPG (that targets a flaw in some particular image viewer) would be different than in an EXE, etc.?

Yes, by watching how it behaves when "activated".

None of this tells me how to isolate a payload in a particular file and defeat it (excise that code stanza, replace it with NoOps, etc.)

And, presumably, AV products don't remove the threat; they remove the carrier!

Thanks, I'll have a look.

- Don Y

Fri, Apr 29, 2016 5:33 AM

I'm not interested in how AV scanners work -- except as a tutorial on how *I* can locate a payload in a particular file.

Obviously, an executable that is a threat to a Windows/x86 box is not likely to be a threat to a SPARC machine -- even though the SPARC owner might want to know that the file is "not as it should be".

- Clifford Heath

Fri, Apr 29, 2016 6:43 AM

There are a number of effective techniques for efficiently scanning for any of many given bitstrings (of extended lengths), at any offset in a suspect file. One that I'm aware of uses a "rolling checksum" like Adler-32, with one such checksum for each target bitstring length. So for fast scanning, the dictionary would normally be normalized to just one or two lengths, and when the rolling checksum fires (by a match in a hash table), the full bitstring gets checked at that location.
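A rough sketch of the rolling-checksum scan, assuming all signatures have been normalized to one length. The checksum is an Adler-style pair of sums (started from zero, as in rsync's variant, rather than true Adler-32), and the function names and sample signatures are invented for illustration.

```python
MOD = 65521  # largest prime below 2**16, the modulus Adler-32 uses

def weak_pair(block):
    """Adler-style sums for a block: a = sum of bytes, b = sum of prefixes."""
    a = b = 0
    for byte in block:
        a = (a + byte) % MOD
        b = (b + a) % MOD
    return a, b

def scan(data, signatures):
    """Return (offset, signature) for every signature occurrence in data."""
    sigs = list(signatures)
    n = len(sigs[0])
    if any(len(s) != n for s in sigs):
        raise ValueError("this sketch needs equal-length signatures")

    # Hash table from weak checksum to the signatures that produce it.
    table = {}
    for s in sigs:
        table.setdefault(weak_pair(s), []).append(s)

    hits = []
    if len(data) < n:
        return hits
    a, b = weak_pair(data[:n])
    for off in range(len(data) - n + 1):
        if off:
            # Slide the window one byte: drop data[off-1], add the new byte.
            out, new = data[off - 1], data[off + n - 1]
            a = (a - out + new) % MOD
            b = (b - n * out + a) % MOD
        # Weak match fired: confirm with a full byte comparison.
        for s in table.get((a, b), ()):
            if data[off:off + n] == s:
                hits.append((off, s))
    return hits

sample = b"xxxxBADCODE1yyyyBADCODE2zz"
print(scan(sample, [b"BADCODE1", b"BADCODE2"]))
```

Each step costs a couple of additions and one dictionary lookup regardless of how many signatures are loaded, which is the attraction; the full comparison only runs when the weak checksum matches.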

"rsync" uses a similar mechanism to find likely block matches, which are then checked by a stronger checksum (MD5, SHA1, etc). We used it in my related patent as well.

There's been an issue with aeroplane cockpits getting infected with Android malware (that cannot harm the plane) when service folk use the USB sockets to charge their phones. The infection can be passed on to other Android devices, which is the actual risk.

Clifford Heath.

- Don Y

Fri, Apr 29, 2016 8:48 AM

Again, that only tells me how to locate a specific *signature* in a file. It doesn't tell me how to find the likely location of an infection in a file and excise the infection from the file (e.g., with a hex editor).

It's easy to see how malware can *deliver* a payload to a system. But, if you assume a virus wants to replicate, then it needs a way of attaching itself to "suitable files" (most obviously executables) in such a way that the virus doesn't need to understand much about what the file is doing.

For example, prepending itself to an executable so that control transfers to the malicious code *first*, allowing it to "do whatever"; then chaining to the original "appended" executable so the user is unaware that anything unexpected has just happened.

It would, I imagine, be difficult to insert malware at a "random" location in an executable -- though I could imagine "enough smarts" allowing it to locate startup code and install itself in some portion of the image KNOWING PARTICULARS ABOUT THE TYPICAL IMAGE FORMAT for that platform.

(I suspect simpler is the safer approach -- for most malware. Too much bloat makes an executable "suspicious")

- Hans-Peter Diettrich

Fri, Apr 29, 2016 10:36 AM

Don Y wrote:

PE program files carry the names of the system calls they import as plain text in their import tables, so it's easy to find useful locations in such a file.
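Because those names are stored as plain ASCII, even a crude "strings"-style pass will surface them without parsing the PE headers at all. The sketch below assumes a small watchlist of real Windows API names chosen purely for illustration, and the sample bytes are a made-up fragment, not a valid PE file.

```python
import re

# Runs of at least four printable ASCII characters.
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{4,}")

# Illustrative watchlist; seeing these names proves nothing by itself.
WATCHLIST = {"CreateRemoteThread", "WriteProcessMemory", "VirtualAllocEx"}

def ascii_strings(data):
    """Return every printable ASCII run found in the raw bytes."""
    return [m.group().decode("ascii") for m in PRINTABLE_RUN.finditer(data)]

def interesting_imports(data):
    """Return watchlisted API names that appear as strings in the data."""
    return sorted(set(ascii_strings(data)) & WATCHLIST)

# Made-up fragment shaped like import-name entries (hint bytes, then name).
sample = (b"MZ\x00\x00KERNEL32.dll\x00"
          b"\x01\x02CreateFileA\x00"
          b"\x03\x04WriteProcessMemory\x00")
print(ascii_strings(sample))
print(interesting_imports(sample))
```

A proper tool would walk the import directory so that each name can be tied to the DLL it comes from and the slot it is called through, but that needs a real PE parser.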

DoDi

- jurb6006

Fri, Apr 29, 2016 12:11 PM

I think they're hiding more than that. I'll leave it at this:

Who are the only ones who profit from the propagation of a virus?

The only ones.

- Nobody

Sun, May 1, 2016 3:53 AM

Viruses which spread by infecting arbitrary files in the hope that such a file will eventually be copied to another system are fairly uncommon nowadays. They have largely been replaced by either

a) worms, which typically infect a specific subsystem (usually related to some form of communication channel), then use that to spread to other systems, or

b) simple malware spread using a top-down approach; compromised systems are used to hijack web or email servers which will then distribute malware to PCs, smartphones etc. Each particular type of malware typically distributes a different type of malware to a different type of system, rather than self-replicating.

Infection doesn't generally attempt to "splice" code into arbitrary executables or DLLs, but either replaces standard executables (or DLLs, etc) with an infected version or installs entirely new files which are then activated by modifying the registry or other configuration mechanisms.

The malicious file may have some degree of polymorphism in order to frustrate detection, but this is unlikely to take the form of merging with a pre-existing version of the file. Rather, any such file will be generated from a "template" which includes specific mechanisms for polymorphism.

- rickman

Sun, May 1, 2016 4:29 AM

And who are the ones who profit from criminal behavior? The legal system! There are obvious (but wrong) conclusions in both cases.

--

Rick C