Reverse Engineering

xWorm Extractor - Extracting Configs Without a Sandbox

Extracting xWorm configs from the binary without the requirement of running it.

Dave Addison

02 Feb 2025 — 9 min read

I read a post in which an actor had compromised thousands of script kiddies by weaponizing a builder application for the xWorm stealer/trojan. Given a recent attempt to rekindle my interest in the technical side of security, today we are going to rip apart the binary and see if we can build a config extractor.

What started as a simple "let's make a Yara rule" grew into "let's make rules to detect other variants", which eventually evolved into a complete Python script to extract the configs of each individual build.

Story below for those who want it. For those who've got better shit to do with their time, here's the script in Git.... happy triage automation!

I'm not a dev so don't expect adequate error handling etc. It's just a PoC to assist people with malware triage. If you can do it cleaner and faster then please do!!!! I take great pride if someone builds off of something I've done!! Let me know.

What is the xWorm Builder

The application runs on Windows as a C2 and starts with a basic config specifying the port and key of the C2. This is a screenshot from a cracked version by UnknownHat127001.

Once port specs are done you're greeted with your standard UI. One option of which is a builder...

Here we can specify multiple IPs, domains etc. to reverse connect to; the port (which should obviously match what you just set on the previous screen); the key, which I assume facilitates communications; and the group, which I'm also assuming may allow you to segregate campaigns but share the same host.

Within settings there are the following options:

Anti Kill - An attempt to stop termination
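WDEX - Not described in the builder. Judging by the 'Exclusion' string it leaves in the binary (covered later), presumably a Windows Defender exclusion.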
Keylogger - Installs a keylogger (duh!)
Anti Analysis - Implements anti debugger techniques
TBotNotify - Telegram Bot posting.
Clipper - Not played with but I believe it attempts to modify clipboard crypto contents.

Additionally we can choose up to three different persistence options 'Registry', 'SchTasks' and 'Startup'. These are not really configurable. The code for each is fairly static as we will see in a bit and fairly easy to detect.

Sleep timer for the time between call backs to the C2. No options for jitter.

USB option for USB infection... didn't play with it but apparently it's a thing.

The last page gives you options to obfuscate the file, which is not as obfuscated as you would like, again, we will see this in a bit. 'Assembly' seemingly allows you to bind an exe to the payload. And icon.... because hey... we all gotta look pretty like PDFs instead of exe's.

First Payload

Keep It Simple Stupid

Starting simple, we can get hold of a basic build from Malware Bazaar. The hash of the file we are using here is "95e1104df5d9080402316949de1137c886f9d53d884cee12d10af499f41d5ac1"

Using Floss we can extract the UTF-8 and UTF-16-le strings. It's basically strings.exe on steroids. The first indicator that they may have obfuscation enabled is a warning from Floss. You can run this without deobfuscating for the time being.
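If you don't have Floss to hand, the basic idea of pulling both string types can be approximated in a few lines of Python. This is only a rough sketch of the concept, not what Floss does internally; the function name and the minimum length of 4 are my own assumptions.

```python
import re

MIN_LEN = 4  # assumed minimum string length, similar to classic strings tools


def extract_strings(data: bytes, min_len: int = MIN_LEN):
    """Return (ascii_strings, utf16le_strings) found in raw bytes."""
    # Runs of printable ASCII bytes
    ascii_re = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    # Printable ASCII characters each followed by a NUL byte (UTF-16LE)
    wide_re = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)

    ascii_strings = [m.decode("ascii") for m in ascii_re.findall(data)]
    wide_strings = [m.decode("utf-16-le") for m in wide_re.findall(data)]
    return ascii_strings, wide_strings
```

Feed it the bytes of the sample (read in binary mode) and you get two lists back, one per encoding, which is the shape the later checks in this post work with.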

So in the UTF-16 segment we have multiple strings which as it happens do not change. They remain there regardless of the options enabled. So for our first notes...

There are three hardcoded user agents:

And a hard coded content length for an HTTP header? This is also fairly unique

Spoiler/TLDR alert... those two strings independently will be present in EVERY build regardless of any options configured. So a yara rule is available if you want it.

rule MAL_XWorm_RAT {
   meta:
      description = "Detect XWorm via multiple hardcoded user agent strings present in binaries as well as a hardcoded content length HTTP header field."
      author = "Dave Addison"
      date = "2025-01-25"
      sha256_1 = "95e1104df5d9080402316949de1137c886f9d53d884cee12d10af499f41d5ac1"
      sha256_2 = "c77420f9b9a1c6dc4dfc36f2b72c575fb882339286c14bb85b79e86b2c2486bc"
   strings:
      $s1 = "Content-length: 5235" ascii wide
      $s2 = "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:66.0) Gecko/20100101 Firefox/66.0" ascii wide
      $s3 = "Mozilla/5.0 (iPhone; CPU iPhone OS 11_4_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/11.0 Mobile/15E148 Safari/604.1" ascii wide
      $s4 = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36" ascii wide
   condition:
      all of them
}
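For anyone triaging without a Yara engine installed, the same condition can be mimicked in plain Python. This is a minimal sketch, assuming you already have the file contents as bytes; the function names are made up for illustration, and it only mirrors the "ascii wide" plus "all of them" logic of the rule.

```python
MARKERS = [
    "Content-length: 5235",
    "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:66.0) Gecko/20100101 Firefox/66.0",
    "Mozilla/5.0 (iPhone; CPU iPhone OS 11_4_1 like Mac OS X) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/11.0 Mobile/15E148 Safari/604.1",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36",
]


def has_marker(data: bytes, marker: str) -> bool:
    """True if the marker is present as ASCII or as UTF-16LE ('wide')."""
    return marker.encode("ascii") in data or marker.encode("utf-16-le") in data


def looks_like_xworm(data: bytes) -> bool:
    """Equivalent of 'all of them' in the rule above."""
    return all(has_marker(data, marker) for marker in MARKERS)
```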

It's all well and good having a way to detect the binary, but what about getting the details out? What configuration options were selected? What's the C2? What's the Port? Group membership? Telegram Bot details?

What about the configs?

Lets grab them all

Firstly a note about the attempts to obfuscate the strings. They're all hardcoded in the binary, but obfuscated using a mix of B64 and encryption. See all the configs for hostname/IP, port and Group membership etc. Looking at this using dnSpy...

If you just slap these through a B64 decode it will be useless. That's because they're decrypted in the main block.

So how are they decrypted? Well the decrypt function is called:

And I want to draw attention to the line containing 'Mutex', which appears to be used to generate the decryption key.

'Mutex' is in the original settings block. So the key is there... it's hardcoded... can we just pull it from the binary without running it? Well yes. Using CLR and the fact that this block of settings does not change, we can pull them all using a simple static range of token values, 0x04000006 through to 0x04000011.
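For context on why those values look the way they do: in .NET metadata a token packs the table number into the top byte and the row number into the lower three bytes, and table 0x04 is the Field table. So this range is simply fields 6 through 17 of the assembly. A small sketch of walking that range follows; the helper names are hypothetical, and actually resolving each token to its stored value depends on whichever metadata library you use, which is left out here.

```python
FIELD_TABLE = 0x04  # metadata table index for Field definitions


def split_token(token: int):
    """Split a metadata token into (table index, row id)."""
    return token >> 24, token & 0x00FFFFFF


def settings_field_tokens(first: int = 0x04000006, last: int = 0x04000011):
    """Inclusive list of the field tokens covering the settings block."""
    return list(range(first, last + 1))


for token in settings_field_tokens():
    table, row = split_token(token)
    # Every token in the range should point into the Field table
    print(f"0x{token:08X} -> table 0x{table:02X}, row {row}")
```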

They're even named for us so we can identify Mutex straight away. Pull it out... create our own decrypt function in Python:
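As a rough illustration of the key side of such a function: the sketch below follows what is commonly documented publicly for xWorm builds, namely an MD5 of the Mutex copied into a 32-byte buffer at offsets 0 and 15, used as an AES key in ECB mode. Treat that scheme as an assumption rather than something confirmed for this exact sample, and the function names as hypothetical.

```python
import base64
import hashlib


def derive_key(mutex: str) -> bytes:
    """Build a 32-byte key from the Mutex (assumed scheme, see lead-in)."""
    digest = hashlib.md5(mutex.encode("utf-8")).digest()  # 16 bytes
    key = bytearray(32)
    key[0:16] = digest   # first copy at offset 0
    key[15:31] = digest  # second copy at offset 15, overwriting one byte
    return bytes(key)    # the final byte stays 0x00


def decode_setting(value_b64: str) -> bytes:
    """Base64-decode a stored setting; the result is still ciphertext."""
    return base64.b64decode(value_b64)


# Assumption: the decoded bytes would then be decrypted with AES in ECB mode
# using derive_key(mutex), via a third-party library such as pycryptodome,
# e.g. AES.new(key, AES.MODE_ECB).decrypt(ciphertext), and the padding removed.
```

Only the standard library is needed for this part; the AES step itself needs a crypto package, which is why it is left as a comment.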

And voila.... we can now see the configs

Hosts: 127.0.0.1,develop-versions.gl.at.ply.gg,develop-versions.gl.at.ply.gg:65059,have-lucia.gl.at.ply.gg
Host: Not Set
Port: 65059
KEY: <123456789>
SPL:
Group: jjsploit
USBNM: USB.exe
InstallDir: %AppData%
InstallStr: explorer.exe
Mutex: F3euhgbNmjl7pCCb

Yooooo I managed 700+ words before using a gif!

What about the other stuff?

Defensive bypasses, telegram bots etc

Well we can do all that with strings. So here is the logic for detecting this.

We have two forms of strings output. UTF16-le and UTF8. Now it does not matter which options you pick... you can detect they're enabled via strings. EVEN when obfuscated.

So to detect the three persistence settings:

for string in extracted_stringsUTF16:
    
    if 'SOFTWARE\\Microsoft\\Windows\\CurrentVersion\\Run' in string: 
        print("Registry Persistence Enabled")
    
    if '/create /f /RL HIGHEST /sc minute /mo 1 /tn ' in string: 
        print("Scheduled Task Persistence Enabled")
    
    if 'WScript.Shell' in string: 
        print("Startup Folder Persistence Enabled")

And for the additional configurations: anti analysis drops strings, as do WDEX and anti kill. The only exception to this logic is the keylogger, which is treated as enabled only by the absence of the string "OfflineKeylogger Not Enabled".

if "RunAntiAnalysis" in extracted_stringsUTF8 and "DetectSandboxie" in extracted_stringsUTF8:
    print("Anti-Analysis is enabled")
if "/sendMessage?chat_id=" in extracted_stringsUTF16:
    # appData is iterated as (name, value) pairs
    not_found = 'Not found - If obfuscated it will be shown further up'
    chat_id = next((value for item, value in appData if item == 'ChatID'), not_found)
    token = next((value for item, value in appData if item == 'Token'), not_found)
    print(f"Telegram Bot is configured\n\tChat ID: {chat_id}\n\tToken: {token}")
if "OfflineKeylogger Not Enabled" not in extracted_stringsUTF16:
    print("Keylogger enabled")
if "WaitForExit" in extracted_stringsUTF8 and "Exclusion" in extracted_stringsUTF8 and "get_ModuleName" in extracted_stringsUTF8:
    print("WDEX Enabled")
if "CriticalProcess_Enable" in extracted_stringsUTF8 and "ProcessCritical" in extracted_stringsUTF8 and "needSystemCriticalBreaks" in extracted_stringsUTF8:
    print("Anti kill enabled")
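If you'd rather keep all of these checks in one place, the same logic can be written as a lookup table. This is just one possible arrangement, assuming the extracted strings are held as two lists and that a substring match is good enough; the table and function names are my own.

```python
def contains(strings, marker):
    """True if any extracted string contains the marker as a substring."""
    return any(marker in s for s in strings)


# option name -> (markers that must ALL be present, which string pool to search)
PRESENT = {
    "Registry persistence": (["SOFTWARE\\Microsoft\\Windows\\CurrentVersion\\Run"], "utf16"),
    "Scheduled task persistence": (["/create /f /RL HIGHEST /sc minute /mo 1 /tn "], "utf16"),
    "Startup folder persistence": (["WScript.Shell"], "utf16"),
    "Anti-Analysis": (["RunAntiAnalysis", "DetectSandboxie"], "utf8"),
    "Telegram bot": (["/sendMessage?chat_id="], "utf16"),
    "WDEX": (["WaitForExit", "Exclusion", "get_ModuleName"], "utf8"),
    "Anti kill": (["CriticalProcess_Enable", "ProcessCritical", "needSystemCriticalBreaks"], "utf8"),
}

# option name -> (marker whose ABSENCE means the option is enabled, pool)
ABSENT = {
    "Keylogger": ("OfflineKeylogger Not Enabled", "utf16"),
}


def detect_options(utf8_strings, utf16_strings):
    """Return the list of builder options that appear to be enabled."""
    pools = {"utf8": utf8_strings, "utf16": utf16_strings}
    enabled = [
        name for name, (markers, pool) in PRESENT.items()
        if all(contains(pools[pool], m) for m in markers)
    ]
    enabled += [
        name for name, (marker, pool) in ABSENT.items()
        if not contains(pools[pool], marker)
    ]
    return enabled
```

Adding a new option then only means adding a row to one of the two tables.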

If telegram is detected, the token and chat id will be pulled. In the event that the file is obfuscated this will appear as a block of data (which I will go over now)

Obfuscation Barriers

Makes it awkward, not impossible.

So obfuscation methods used by xWorm are not ideal, but they were enough to stop this method of extracting the data in nicely named blocks. Now I don't know what the Mutex is ☹️

So we are back at the questions:

How do we identify the Mutex string?
How do we identify relevant strings?

Well let's go back to string analysis again. We can take a peek at the settings config using dnSpy from a known obfuscated binary. Alas, the binary we used prior didn't have obfuscation enabled so I made one to demo with quickly.

We can see the settings object is now replaced with its obfuscated counterpart

There are obfuscated config names so we can't see what's what. We can see, however, that the Mutex is still there: "cIT6QTOthjWV8guD"

What about the other string values.... are they in the strings outputs? Well as it happens, they're all sitting next to one another in a strings output...

So we have the values but not the names of the vars. All useless though unless we can get the Mutex to run decryption. What do we know programmatically that can nab the Mutex?

It's 16 characters long, and at some point it will be able to decrypt a port number.

Why focus on the port number? Because it's always in the same position ("hosts", "host", "port") and it's required regardless of whether the user uses IPs or domains. Ports need to be specified otherwise nothing can make a connection.

So after brute forcing decryption using every string, against every string.... we are left with 3 that, at some point, ran the decrypt function and returned an integer value. This indicates that, whilst acting as the Mutex (i.e. the key), they may have decrypted the port string.

Out of those three, one was 16 characters long.... our Mutex.

# Loop through and use strings to decrypt other strings, looking for a number,
# which will be a port number (always set and always a number).
for item in array:
    for item2 in array:
        try:
            val = decrypt(item, item2)
            if val.isnumeric():
                possibleMutex.append(item2)
        except Exception:
            continue

print(f"[+] Found {len(possibleMutex)} potential encryption strings")

# We may match the sleep integer and other numerics not port related.
print("[+] Checking top strings for best chances")

# Mutex is always 16 chars
possibleMutex = [potential for potential in possibleMutex if len(potential) == 16]
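To make the search logic easier to play with, here is a self-contained toy version. The cipher is a throwaway XOR-plus-Base64 stand-in that I made up purely so the example runs without the malware's real decrypt routine, and the extra port range check (1 to 65535) is my own tightening rather than something the script above does.

```python
import base64
from itertools import product


def toy_encrypt(plaintext: str, key: str) -> str:
    """Stand-in cipher for the demo only: repeating-key XOR, then Base64."""
    kb = key.encode()
    data = bytes(b ^ kb[i % len(kb)] for i, b in enumerate(plaintext.encode()))
    return base64.b64encode(data).decode()


def toy_decrypt(ciphertext: str, key: str) -> str:
    """Inverse of toy_encrypt; raises on bad Base64 or undecodable output."""
    kb = key.encode()
    data = base64.b64decode(ciphertext, validate=True)
    return bytes(b ^ kb[i % len(kb)] for i, b in enumerate(data)).decode("utf-8")


def find_key_candidates(strings, decrypt, key_len=16):
    """Try every string as a key against every string; keep keys that yield a port."""
    candidates = []
    for ciphertext, key in product(strings, repeat=2):
        try:
            value = decrypt(ciphertext, key)
        except Exception:
            continue  # wrong key or not ciphertext at all
        if value.isascii() and value.isdigit() and 1 <= int(value) <= 65535:
            if key not in candidates:
                candidates.append(key)
    # The Mutex is always 16 characters long
    return [key for key in candidates if len(key) == key_len]


mutex = "F3euhgbNmjl7pCCb"
strings = [toy_encrypt("65059", mutex), toy_encrypt("jjsploit", mutex), mutex, "USB.exe"]
print(find_key_candidates(strings, toy_decrypt))
```

Swap toy_decrypt for a real decrypt function with the same (ciphertext, key) argument order and the search itself stays unchanged.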

Now we can use our Mutex against the strings we find.... attempt a decrypt.... if successful, print to screen.

There could be a logic here to determine which config is which... but use your bonce and figure it out. This screenshot actually shows how the Telegram bot API details end up in this output untagged.
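For anyone who wants a starting point for that tagging logic, something along these lines might do. It is a guess-based sketch only: the patterns are loose heuristics of my own, the Telegram token shape (digits, a colon, then a long URL-safe string) is an assumption about how bot tokens usually look, and the labels are made up.

```python
import re

# Assumed shape of a Telegram bot token: numeric bot id, colon, long secret
TELEGRAM_TOKEN_RE = re.compile(r"^\d{6,12}:[A-Za-z0-9_-]{30,}$")
# Loose host pattern: dotted name or IPv4, optionally followed by :port
HOST_RE = re.compile(r"^[A-Za-z0-9.-]+\.[A-Za-z0-9-]+(:\d{1,5})?$")


def guess_label(value: str) -> str:
    """Best-effort guess at what a decrypted, untagged value represents."""
    if value.isascii() and value.isdigit():
        # Small numbers could be the port or the sleep timer
        return "port or sleep" if 1 <= int(value) <= 65535 else "number"
    if TELEGRAM_TOKEN_RE.match(value):
        return "telegram token"
    if value.lower().endswith(".exe"):
        return "file name"
    if value.startswith("%") and value.endswith("%"):
        return "install dir"
    if all(HOST_RE.match(part) for part in value.split(",")):
        return "hosts"
    return "unknown"
```

Anything that comes back as "unknown" (group names, keys and so on) still needs a human eye.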

Going back to the original malware we got from Malware Bazaar.... here's what config we can pull...

Some IOCs for closure and for future reference.

95e1104df5d9080402316949de1137c886f9d53d884cee12d10af499f41d5ac1
develop-versions.gl.at.ply[.]gg
develop-versions.gl.at.ply[.]gg:65059
have-lucia.gl.at.ply[.]gg