17 It’s damned cold outside, so let’s light ourselves a fire!

image

Des Teufels liebstes Möbelstòck ist die lange Bank. эTO самиздат.
Compiled on December 20, 2017. Free Radare2 license included with each and every copy!
€0, $0 USD, $0 AUD, 0 RSD, 0 SEK, $50 CAD, 6 × 1029 Pengő (3 × 108 Adópengő), 100 JPC.

17:02 Constructing AES-CBC Shellcode

by Albert Spruyt and Niek Timmers

Howdy folks!

Imagine, if you will, that you have managed to bypass the authenticity measures (i.e., secure boot) of a secure system that loads and executes an binary image from external flash. We do not judge, it does not matter if you accomplished this using a fancy attack like fault injection 0 or the authenticity measures were lacking entirely.1 What’s important here is that you have gained the ability to provide the system with an arbitrary image that will be happily executed. But, wait! The image will be decrypted right? Any secure system with some self respect will provide confidentiality to the image stored in external flash. This means that the image you provided to the target is typically decrypted using a strong cryptographic algorithm, like AES, using a cipher mode that makes sense, like Cipher-Block-Chaining (CBC), with a key that is not known to you!

Works of exquisite beauty have been made with the CBC-mode of encryption. Starting with humble tricks, such as bit flipping attacks, we go to heights of dizzying beauty with the padding-oracle-attack. However, the characteristics of CBC-mode provide more opportunities. Today, we’ll apply its bit-flipping characteristics to construct an image that decrypts into executable code! Pretty nifty!

image

Cipher-Block-Chaining (CBC) mode

The primary purpose of the CBC-mode is preventing a limitation of the Electronic Code Book (ECB) mode of encryption. Long story short, the CBC-mode of encryption ensures that plain-text blocks that are the same do not result in duplicate ciphertext blocks when encrypted. Below is an ASCII art depiction of AES decryption in CBC-mode. We denote a cipher text block as CTi and a plain text block as PTi.

image

An important aspect of CBC-mode is that the decryption of CT2 depends, besides the AES decryption, on the value of CT1. Magically, without knowing the decryption key, flipping 1 or more bits in CT1 will flip 1 or more bits in PT2.

Let’s see how that works, where ^ 1 denotes flipping a bit at an arbitrary position.

CT1 ^ 1 + CT2

Which get decrypted into:

TRASH + PT2 ^ 1

A nasty side effect is that we completely trash the decryption of CT1 but, if we know the contents of PT2, we can fully control PT2 to our heart’s delight! All this magic can be attributed to the XOR operation being performed after the AES decryption.

Chaining multiple blocks

We now know how to control a single block decrypted using CBC-mode by trashing another. But what about the rest of the image? Well, once we make peace with the fact that we will never control everything, we can try to control half! If we consider the bit-flipping discussion above, let’s consider the following image encrypted with AES-128-CBC, for which we do not control the IV:

CT1 + CT2 + CT3 + CT4 + ...

Which gets decrypted into:

PT1 + PT2 + PT3 + PT4 + ...

No magic here! All is decrypted as expected. However, once we flip a bit in CT1 , like:

CT1 ^ 1 + CT2 + CT3 + CT4 + ...

Then, on the next decryption, it means we trash PT1 but control PT2 , like:

TRASH + CT2 ^ 1 + PT3 + PT4 + ...

The beauty of CBC-mode is that with the same ease we can provide:

CT1 ^ 1 + CT2 + CT1 ^ 1 + CT2 + ...

Which results in:

TRASH + CT2 ^ 1 + TRASH + CT2 ^ 1 + ...

Using this technique we can construct an image in which we control half of the blocks by only knowing a single plain-text/ciphertext pair! But, this makes you wonder, where can we obtain such a pair? Well, we all know that known data (such as 00s or FFs) is typically appended to images in order to align them to whatever size the developer loves. Or perhaps we know the start of an image! Not completely unlikely when we consider exception vectors, headers, etc. More importantly, it does not matter what block we know, as long as we know a block or more somewhere in the original encrypted image. Now that we cleared this up, let’s see how we can we construct a payload that will correctly execute under these restrictions!

Payload and Image construction

Obviously we want to do something useful; that is, to execute arbitrary code! As an example, we will write some code that prints a string on the serial interface that allows us to identify a successful attack. For the hypothetical target that we have in mind, this can be accomplished by leveraging the function SendChar() that enables us to print characters on the serial interface. This type of functionality is commonly found on embedded devices.

We would like to execute shellcode like the following: beacon out on the UART and let us know that we got code execution, but there’s a bit of a problem.

image

This piece of code spans multiple 16-byte blocks, which is a problem as we only partially control the decrypted image. There will always be a trashed block in between controlled blocks. We mitigate this problem by splitting up the code into snippets of twelve bytes and by adding an additional instruction that jumps over the trashed block to the next controlled block. By inserting place holders for the trash blocks we allow the assembler to fill in the right offset for the next block. Once the code is assembled, we will remove the placeholders!

image
image

Let’s put everything together and write some Python (page 511) to introduce the concept to you in a language we all understand, instead of that most impractical of languages, English. We use a different payload that is easier to comprehend visually. Obviously, nothing prevents you from replacing the actual payload with something useful like the payload described earlier or anything else of your liking!

image

Python to Force a Payload into AES-CBC

image

In a real world scenario it is likely that we do not control the IV. This means, execution starts from the beginning of the image, we’ll need to survive executing the first block which consists of random bytes. This can accomplished by taking the results from PoC GTFO 14:06 (page 66) into account where we showed that surviving the execution of a random 16-byte block is somewhat trivial (at least on ARM). Unless very lucky, we can generate different images with a different first block until we can profit!

We hope the above demonstrates the idea concretely so you can construct your own magic CBC-mode images! :)

Once again we’re reminded that confidentiality is not the same as integrity, none of this would be possible if the integrity of the data is assured. We also, once again, bask in the radiance of the CBC-mode of encryption. We’ve seen that with some very simple operations, and a little knowledge of the plain-text, we can craft half-controlled images. By simply skipping over the non-controllable blocks, we can actually create a fully functional encrypted payload, while having no knowledge of the encryption key. If this doesn’t convince you of the majesty of CBC then nothing will.

image

17:03 In the Company of Rogues: Pastor Laphroaig’s Tall Tales of Science and of Fiction

by Pastor Manul Laphroaig

Gather ’round, neighbors. The time for carols and fireside stories is upon us. So let’s talk about literature, the heart-warming stories of logic, science, and technology. For even though Santa Claus, Sherlock Holmes, and Captain Kirk are equally imaginary, their impact on us was very real, but also very different at the different times of our lives, and we want to give them their due.

Fiction, of course, works by temporary suspension of disbelief in made-up things, people, and circumstances, but some made-up things make us raise our eyebrows higher than others. Still, the weirdest part is that the things that are hard to believe in the same story sometimes change with time!

So I was recently re-reading some Sherlock Holmes stories, and a thought struck me: in the modern world that succeeded Conan Doyle’s London, both Mr. Holmes and Dr. Watson would, in fact, be criminals.

Consider: Holmes’ use of narcotics to stimulate his brain in the absence of a good riddle would surely end up with the modern, scientifically organized police sending him to prison rather than deferentially consulting him on their cases. What’s more, with all his chemical kit and apparatus, they’d be congratulating themselves on a major drug lab bust. Even if Dr. Watson escaped prosecution as an accomplice, he’d likely lose his medical license, at the very least.

Nor would that be Watson’s only problem. Consider his habit of casually sticking his revolver in his coat pocket when going out to confront some shady and violent characters that his friend’s interference with their intended victims would severely upset. This habit would as likely as not land him in serious trouble. His gun crimes were, of course, not as bad as Holmes’—“ ...when Holmes in one of his queer humors would sit in an arm-chair with his hair trigger and a hundred Boxer cartridges, and proceed to adorn the opposite wall with a patriotic V.R. done in bullet pocks,...”—but would be quite enough to put the good doctor away among the very classes of society that Mr. Holmes was so knowledgeable about.

image

I wonder what would surprise Sir Arthur Conan Doyle, KStJ, DL more about our scientific modernity: that an upstanding citizen would need special permission to defend himself with the best mechanical means of the age when standing up for those abused by the violent bullies of the age, or that such citizens would need a license to own a chemistry lab with boiling flasks, Erlenmeyer flasks, adapter tubes, and similar glassware,0 let alone the chemicals.

Just imagine that a few decades from now the least believable part of a Gibson cyberpunk novel might be not the funky virtual reality, but that the protagonist owns a legal debugger. Why, owning a road-worthy military surplus tank sounds less far fetched!

In Conan Doyle’s stories, Mr. Holmes and Dr. Watson represented the best of the science and tech-minded vanguard of their age. Holmes was an applied science polymath, well versed in chemistry, physics, human biology, and innumerable other things. Even his infamous indifference to the Copernican theory1 is likely due to his unwillingness to repeat the dictums that a member of the contemporary good society had to “know,” i.e., know to repeat, without thinking about them first. As for Watson, his devotion to science is seriously underappreciated—just imagine what sort of stinky, loud, and occasionally explosive messes he opted to put up with. It takes a genuine conviction of the value of scientific experiment to do so, his respect for Sherlock notwithstanding.

Just in case you wonder how Watson’s trusty revolver fits into this, remember that in his time it represented the pinnacle of mechanical and chemical engineering, just like rocketry did some half a century later. In fact, the Boxer from a couple of paragraphs back, Col. Edward Mounier Boxer, F.R.S., besides inventing the modern centerfire primer that Holmes used in his Webley to spell Queen Victoria’s initials and that we use to this day in our ammo, also designed an early two-stage rocket. This same principle of rocketry was later used by Robert Hutchings Goddard.

————

But of course times change, and we change with them. So I put that book aside, and opened another, which was rockets and space travel all over: a Heinlein juvenile novel, Rocket Ship Galileo . Heinlein’s juvies are a great way to remind yourself about the basics of space flight and celestial mechanics—but I wish I hadn’t, neighbors, not in the frame of mind I was in.

You see, in this 1947 novel three teenagers, who dabble in rocketry and earn their rocket pilot licenses, are taken to the Moon by their uncle, a nuclear physicist and space flight expert. The only people who try to stop them, under the pretext of “endangering minors,” are actual Nazis—and the local sheriff sees right through them. So The Galileo lifts off to seek adventure and handy explanations of the scientific method, the crowd and the state police cheer, and the stranger with the fake minor protection injunction is taken into custody.

Now that was 1948. Many things changed since then. Vertical landing of space rockets, which made the reader of these juvies cringe just a few years ago, has become a technical reality. But a sheriff approving of a risky activity with mere parental consent is what really stretches belief nowadays; the Moon Nazis with their fake child protection order would’ve won easily.

Granted, juvie fiction is bound to stretch the truth a little, to give teenagers a place in the adult action to aspire to. But this is the kind of a stretch that inspired the first generation of actual NASA engineers. The characters of the former NASA engineer’s memoir Rocket Boys built homemade rockets just like Heinlein’s teen protagonists. Just like Heinlein’s fictional teens, they initially got into trouble for it, and were similarly rescued by adults who used their discretion rather than today’s zero tolerance polices.

image

Now you can read the book or watch the movie, October Sky , and count the felonies a teenager these days would rack up for trying the things that brought the author, Homer H. Hickam, Jr., from a West Virginia coal mining town to NASA.

And speaking of movies, neighbors, do you recall that Star Trek episode, Arena , in which Captain Kirk is dumped on a primitive world and made to fight a hostile reptilian alien? The fight is arranged by a powerful civilization annoyed by Kirk’s and the Gorn’s ships dog-fighting in their space; it somehow fits their sense of justice to reduce a spaceship battle to single combat of the captains. Both combatants are deprived of any familiar tools, but the alien Gorn is much, much stronger, and easily tosses Kirk around.

Of course, all of that was just the setup for a classic story of science education. Kirk saves himself and his ship by spotting the ingredients for making black powder, then using the concoction to disable his scaly, armored opponent closing for the kill.

I wonder, though: would the black powder hack have occurred so easily to Kirk if he—and the screenwriters, and a significant part of the 1960s audience expected to appreciate the trick— hadn’t as teenagers experimented with making things go boom? And, if they hadn’t, would there even be a Star Trek—and the space program?

Such skills used to be synonymous with basic science training. Now, for all practical purposes, they are synonymous with school suspension if you are lucky, or a criminal record if you aren’t.

Think about the irony of this, neighbors. The enlightened opinion of our age is all about the virtues of STEM, but it punishes with a heavy hand exactly those interests that propelled the actual science and technology, because they could be dangerous. And what’s dangerous must be banned, and children must be taught to fear and shun it, from grade school onward.

How did we come to this?

Somewhere along the way of technological progress we have picked up a fallacy that grew and grew, until it became the default way of thinking—so entrenched that one needs an effort to nail it down explicitly, in so many words.

It is the idea that progress somehow means and requires banning or suppressing the dangerous things, the risky things, the tools that could be abused to cause harm. If the tool and the skill are too useful to be expunged entirely, they must be limited to special people who have superior abilities, and who are emphatically not you.

image

Verily I tell you, neighbors: although it may feel fine to suffer the ban on a tool or a skill that neither you nor anyone you know cares to use, it is not progress you are getting this way; it is the very opposite. For when some tools are deemed to be too powerful and too dangerous to be left in your hands, the same fallacy will come for your actual favorite tools, and sooner than you think. The folks inclined to listen to your explanations of why your tools are not evil will be too few and far between.

Knowledge is power, “Scientia potentia est.” Power, by definition, is dangerous and can be misused. When the possibility of misuse gets to be enough grounds for banning a technology to the public, it’s only a matter of time till you are deemed unworthy to wield the power of knowledge without permission. Good luck with hoping that the bureaucracy set up to manage these permissions will be sympathetic towards your interests.

And then, of course, the well-meaning community leaders, law-makers, and officials will wonder why people’s interest in their approved version of STEM is lacking, despite all the glossy pictures of happy kids and smiling adult models doing something vaguely scientific against the background of some generic lab equipment. It doesn’t really take long for kids to learn that looking for potentia in scientia means trouble; and who cares for scientia that is not potentia?

Open a newspaper, neighbors, and you will see a lot of folks calling each other “anti-science,” as one of the worst possible pejoratives. Yet I wonder: what harms science more than banning its basic technological artifacts from common use, be they mechanical, chemical, electronic, or even mathematical?2 And, should it come to calling the shots on banning things, would you rather have the people who proclaim the importance of science but have zero interest in tinkering with its actual artifacts, or the actual tinkerers who obsessively fix cars, hand-load ammo, or write programs?

The world has become a much stranger place since the time when our classic tales of logic, science, and technology were written. We will yet have to explain again and again that doctors don’t cause epidemics,3 that engineers don’t cause murder or terrorism; and that hackers do not cause computer crime.

Yet through all of this, may we remember to keep building our own bird feeders, and to let our neighbors build theirs, even when we disapprove of theirs just as they might disapprove of ours. For this is the only way for progress to happen: in freedom and by regular, non-special people making risky things that have power and learning to make them better. Thus and only thus do the tall tales of science and technology come true. Amen.

17:04 Sniffing BTLE with the Micro:Bit

by Damien Cauquil

Howdy y’all!

It’s well known that sniffing Bluetooth Low Energy communications is a pain in the bottom, unless you have specialty tools like the Ubertooth One and its competitors. During my exploration of the BBC Micro:Bit, I discovered the very interesting fact that it may be used to sniff BLE communications.

The BBC Micro:Bit is a small device based on a nRF51822 transceiver made by Nordic Semiconductor, with a 5 × 5 LED screen and two buttons that can be powered by two AAA batteries. The nRF51822 is able to communicate over multiple protocols: Enhanced ShockBurst (ESB), ShockBurst (SB), GZLL, and Bluetooth Low Energy (BLE).

Nordic Semiconductor provides its own implementation of a Bluetooth Low Energy stack, released in what they call a Soft-Device and a well-known closed-source sniffing firmware used in Adafruit’s BlueFriend LE sniffer for instance. That doesn’t help that much, as this firmware relies on BLE connection requests to start following a specific connection, and not on packets exchanged between two devices in an existing connection. So, I found no way to cheaply sniff an existing BLE connection.

In this short article, I’ll describe how to implement a Bluetooth Low Energy sniffer as software on the BBC Micro:Bit that can follow pre-existing connection despite channel hopping. In cases where channel remapping is in use, it can sniff connections on which even the Ubertooth currently fails.

image

The Goodspeed Way of Sniffing

The Micro:Bit being built upon a nRF51822, it ignited a sparkle in my mind as I remembered the hack found by our great neighbor Travis Goodspeed who managed to turn another Nordic Semiconductor transceiver (nRF24L01+) into a sniffer.0 I was wondering if by any chance this nRF51822 would have been prone to the same error, and therefore could be turned into a BLE sniffer.

It took me hours to figure out how to reproduce this exploit on this chip, but in fact it works exactly the same way as described in Travis’ paper. Since the nRF51822 is a lot different than the nRF24L01+ (as it includes its own CPU rather being driven by a SPI bus), we must change multiple parameters in order to sniff BLE packets over the air.

First, we need to enable the processor high frequency clock because it is required before enabling the RADIO module of the nRF51822. This is done with the following code.

image

Then, we must specify the mode, addresses, power and frequency our nRF51822 will be tuned to.

image

The trick here, as described in Travis’ paper, is to use an address length of two bytes instead of the five bytes expected by the chip. The address length is stored in a configuration register called PCNF0 , along with other extra parameters. The PCNF0 and PCNF1 registers define the way the nRF51822 will behave: its endianness, the expected payload size, the address size and much more documented in the nRF51 Series Reference Manual.1

The following lines of code configure the nRF51822 to use a two-byte address, big-endian with a maximum payload size of 10 bytes.

image

Eventually, we have to disable the CRC computation in order to make the chip consider any data received as valid.

image

Identifying BLE Connections

With this setup, we can now receive crappy data from the 2.4GHz bandwidth and hopefully some BLE packets. The problem is now to find the needle in the haystack, that is a valid BLE packet in the huge amount of data received by our nRF51822.

A BLE packet starts with an access address, a 32-bit carefully-chosen value that uniquely identifies a link between two BLE devices, as specified in the Bluetooth 4.2 Core Specifications document. This access address is followed by some PDU and a 3-byte CRC, but this CRC value is computed from a CRCInit value that is unique and associated with the connection. The BLE packet data is whitened in order to make it more tamper-resistant, and should be dewhitened before processing. If the connection is already initiated, as it is our case, the PDU is a Data Channel PDU with a specific two-byte header, as stated in the Bluetooth Low Energy specifications.

image

Figure 2.13: Data channel PDU header

When a BLE connection is established, keep-alive packets with a size of 0 bytes are exchanged between devices.

Again, we follow the same methodology as Travis’ by listing all the candidate access addresses we get, and identifying the redundant ones. This is the same method chosen by Mike Ryan in its Ubertooth BTLE tool from WOOT13,2 with a nifty trick: we determine a valid access address based on the number of times we have seen it combined with a filter on its dewhitened header. We may also want to rely on the way the access address is generated, as the core specifications give a lot of extra constraints access address must comply with, but it is not always followed by the different implementations of the Bluetooth stack.

Once we found a valid access address, the next step consists in recovering the initial CRC value which is required to allow the nRF51822 to automatically check every packet CRC and let only the valid ones go through. This process is well documented in Mike Ryan’s paper and code, so we won’t repeat it here.

With the correct initial CRC value and access address in hands, the nRF51822 is able to sniff a given connection’s packets, but we still have a problem. The BLE protocol implements a basic channel hopping mechanism to avoid sniffing. We cannot sit on a channel for a while without missing packets, and that’s rather inconvenient.

Following the Rabbit

The Bluetooth Low Energy protocol defines 37 different channels to transport data. In order to communicate, two devices must agree on a hopping sequence based on three characteristics: the hop interval, the hop increment, and the channel map.

The first one, the hop interval, is a value specifying the amount of time a device should sit on a channel before hopping to the next one. The hop increment is a value between 5 and 16 that specifies the number of channels to add to the current one (modulo the number of used channels) to get the next channel in the sequence. The last one may be used by a connecting device to restrict the channels used to the ones given in a bitmap. The channel map was quite a surprise for me, as it isn’t mentioned in Ubertooth’s BTLE documentation.3

image

Figure 17.24: Hopping Algorithm

We need to know these values in order to capture every possible packets belonging to an active connection, but we cannot get them directly as we did not capture the connection request where we would find them. We need to deduce these values from captured packets, as we did for the CRC initial value. In order to find out our first parameter, the hop interval, Mike Ryan designed the simplest algorithm that could be: measuring the time between two packets received on a specific channel and dividing it by the number of channels used, i.e. 37. So did I, but my measures did not seem really accurate, as I got two distinct values rather than a unique one. I was puzzled, as it would normally have been straightforward as the algorithm is simple as hell. The only explanation was that a valid packet was sent twice before the end of the hopping cycle, whereas it should only have been sent once. There was something wrong with the hopping cycle.

It seems Mike Ryan made an assumption that was correct in 2013 but not today in 2017. I checked the channels used by my connecting device, a Samsung smartphone, and guess what? It was only using 28 channels out of 37, whereas Mike assumed all 37 data channels will be used. The good news is that we now know the channel map is really important, but the bad news is that we need to redesign the connection parameters recovery process.

Improving Mike Ryan’s Algorithm

First of all, we need to determine the channels in use by listening successively on each channel for a packet with our expected access address and a valid CRC value. If we get no packet during a certain amount of time, then this channel is not part of the hopping sequence. Theoretically, this may take up to four seconds per channel, so not more than three minutes to determine the channel map. This is a significant amount of time, but luckily devices generally use more than half of the available channels so it would be quicker.

Once the channel map is recovered, we need to determine precisely the hop interval value associated with the target connection. We may want our sniffer to sit on a channel and measure the time between two valid packets, but we have a problem: if less than 37 channels are used, one or more channels may be reused to fill the gaps. This behavior is due to a feature called “channel remapping” that is defined in the Bluetooth Low Energy specifications, which basically replace an unused channel by another taken from the channel map. It means a channel may appear twice (or more) in the hopping sequence and therefore compromise the success of Mike’s approach.

image

A possible workaround involves picking a channel that appears only once in the hopping sequence, whatever the hop increment value. If we find such a channel, then we just have to measure the time between two packets, and divide this value by 37 to recover the hop interval value. The algorithm in Figure 17.24 may be used to pick this channel.

This algorithm finds a unique channel only if more than the half of the data channels are used, and may possibly work for a fewer number of channels depending on the hop increment value. This quick method doesn’t require a huge amount of packets to guess the hop interval.

The last parameter to recover is the hop increment, and Mike’s approach is also impacted by the number of channels in use. His algorithm measures the time between a packet on channel 0 and channel 1, and then relies on a lookup table to determine the hop increment used. The problem is, if channel 1 appears twice then the measure is inaccurate and the resulting hop increment value guessed wrong.

image

Figure 17.25: Channel Lookup Table

Again, we need to adapt this algorithm to a more general case. My solution is to pick a second channel derived from the first one we have already chosen to recover the hop interval value, for which the corresponding lookup table only contains unique values. The lookup table is built as shown in Figure 17.25.

Eventually, we try every possible combination and only keep one that does not contain duplicate values, as shown in Figure 17.26.

image

Figure 17.26: Picking the Second Channel

Last but not least, in Figure 17.27 we build the lookup table from these two carefully chosen channels, if any. This lookup table will be used to deduce the hop increment value from the time between these two channels.

Patching BBC Micro:Bit

Thanks to the designers of the BBC Micro:Bit, it is possible to easily develop on this platform in C and C ++ . Basically, they wrote a Device Abstraction Layer4 that provides everything we need except the radio, as they developed their own custom protocol derived from Nordic Semiconductor ShockBurst protocol. We must get rid of it.

I removed all the useless code from this abstraction layer, the piece of code in charge of handling every packet received by the RADIO module of our nRF51822 in particular. I then substitute this one with my own handler, in order to perform all the sniffing without being annoyed by some hidden third-party code messing with my packets.

image
image

Figure 17.27: Deducing the Hop Increment

Eventually, I coded a specific firmware for the BBC Micro:Bit that is able to communicate with a Python command-line interface, and that can be used to detect and sniff existing connections. This is not perfect and still a work in progress, but it can passively sniff BLE connections. Of course, it may lack the legacy sniffing method based on capturing connection requests; that will be implemented later.

This tiny tool, dubbed ubitle, is able to enumerate every active Bluetooth Low Energy connections.

image

It is also able to recover the channel map used by a given connection, as well as its hop interval and increment.

image

Once all the parameters recovered, it may also dump traffic to a PCAP file.

image

The resulting PCAP file may be opened in Wireshark to dissect the packets. You may notice the keep-alive packets are missing from this capture. It is deliberate; these packets are useless when analyzing Bluetooth Low Energy communications.

Source code

The source code of this project is available on Github under GPL license, feel free to submit bugs and pull requests.5

This tool does not support dynamic channel map update or connection request based sniffing, which are implemented in Nordic Semiconductor’s closed source sniffer. It’s PoC GTFO so take my little tool as it is: a proof of concept demonstrating that it is possible to passively sniff BLE connections for less than twenty bucks, with a device one may easily find on the Internet.

image

17:05 Bit-Banging Ethernet

by Andrew D. Zonenberg, because real hackers need neither PHYs nor NICs!

If you’re reading this, you’ve almost certainly used Ethernet on a PC by means of the BSD sockets API. You’ve probably poked around a bit in Wireshark and looked at the TCP/IP headers on your packets. But what happens after the kernel pushes a completed Ethernet frame out to the network card?

A PC network card typically contains three main components. These were separate chips in older designs, but many modern cards integrate them all into one IC. The bus controller speaks PCIe, PCI, ISA, or some other protocol to the host system, as well as generating interrupts and handling DMA. The MAC (Media Access Controller) is primarily responsible for adding the Ethernet framing to the outbound packet. The MAC then streams the outbound packet over a “reconciliation sublayer” interface to the PHY (physical layer), which converts the packet into electrical or optical impulses to travel over the cabling. This same process runs in the opposite direction for incoming packets.

In an embedded microcontroller or SoC platform, the bus controller and MAC are typically integrated on the same die as the CPU, however the PHY is typically a separate chip. FPGA-based systems normally implement a MAC on the FPGA and connect to an external PHY as well; the bus controller may be omitted if the FPGA design sends data directly to the MAC. Although the bus controller and its firmware would be an interesting target, this article focuses on the lowest levels of the stack.

MII and Ethernet framing

The reconciliation sublayer is the lowest (fully digital) level of the Ethernet protocol stack that is typically exposed on accessible PCB pins. For 10/100 Ethernet, the base protocol is known as MII (Media Independent Interface). It consists of seven digital signals each for the TX and RX buses: a clock (2.5 MHz for 10Base-T, 25 MHz for 100Base-TX), a data valid flag, an error flag, and a 4-bit parallel bus containing one nibble of packet data. Other commonly used variants of the protocol include RMII (reduced-pin MII, a double-data-rate version, which uses less pins), GMII (gigabit MII, that increases the data width to 8 bits and the clock to 125 MHz), and RGMII (a DDR version of GMII using less pins). In all of these interfaces, the LSB of the data byte/nibble is sent on the wire first.

An Ethernet frame at the reconciliation sublayer consists of a preamble (seven bytes of 0x55 ), a start frame delimiter (SFD, one byte of 0xD5 ), the 6-byte destination and source MAC addresses, a 2-byte EtherType value indicating the upper layer protocol (for example 0x0800 for IPv4 or 0x86DD for IPv6), the packet data, and a 32-bit CRC-32 of the packet body (not counting preamble or SFD). The byte values for the preamble and SFD have a special significance that will be discussed in the next section.

10Base-T Physical Layer

The simplest form of Ethernet still in common use is known as 10Base-T (10 Mbps, baseband signaling, twisted pair media). It runs over a cable containing two twisted pairs with 100 ohm differential impedance. Modern deployments typically use Cat5 cabling, which contains four twisted pairs. The orange and green pairs are used for data (one pair in each direction), while the blue and brown pairs are unused.

image

Figure 17.28: 10Base-T Waveform

image

When the line is idle, there is no voltage difference between the positive (white with stripe) and negative (solid colored) wires in the twisted pair. To send a 1 or 0 bit, the PHY drives 2.5V across the pair; the direction of the difference indicates the bit value. This technique allows the receiver to reject noise coupled into the signal from external electromagnetic fields: since the two wires are very close together the induced voltages will be almost the same, and the difference is largely unchanged.

Unfortunately, we cannot simply serialize the data from the MII bus out onto the differential pair; that would be too easy! Several problems can arise when connecting computers (potentially several hundred feet apart) with copper cables. First, it’s impossible to make an oscillator that runs at exactly 20 MHz, so the oscillators providing the clocks to the transmit and receive NIC are unlikely to be exactly in sync. Second, the computers may not have the same electrical ground. A few volts offset in ground between the two computers can lead to high current flow through the Ethernet cable, potentially destroying both NICs.

In order to fix these problems, an additional line coding layer is used: Manchester coding. This is a simple 1:2 expansion that replaces a 0 bit with 01 and a 1 bit with 10, increasing the raw data rate from 10 Mbps (100 ns per bit) to 20 Mbps (50 ns per bit). This results in a guaranteed 1–0 or 0–1 edge for every data bit, plus sometimes an additional edge between bits.

Since every bit has a toggle in the middle of it, any 100 ns period without one must be the space between bits. This allows the receiver to synchronize to the bit stream; and then the edge in the middle of each bit can be decoded as data and the receiver can continually adjust its synchronization on each edge to correct for any slight mismatches between the actual and expected data rate. This property of Manchester code is known as self clocking.

Another useful property of the Manchester code is that, since the signal toggles at a minimum rate of 10 MHz, we can AC couple it through a transformer or (less commonly) capacitors. This prevents any problems with ground loops or DC offsets between the endpoints, as only changes in differential voltage pass through the cables.

We now see the purpose of the 55 55 ... D5 preamble: the 0x55 ’s provide a steady stream of meaningless but known data that allows the receiver to synchronize to the bit clock, then the 0xD5 has a single bit flipped at a known position. This allows the receiver to find the boundary between the preamble and the packet body.

That’s it! This is all it takes to encode and decode a 10Base-T packet. Figure 17.28 shows what this waveform actually looks like on an oscilloscope.

One last bit to be aware of is that, in between packets, a link integrity pulse (LIT) is sent every 16 milliseconds of idle time. This is simply a +2.5V pulse about 100 ns long, to tell the remote end, “I’m still here.” The presence or absence of LITs or data traffic is how the NIC decides whether to declare the link up.

By this point, dear reader, you’re probably thinking that this doesn’t sound too hard to bit-bang — and you’d be right! This has in fact been done, most notably by Charles Lohr on an AT-Tiny microcontroller.0 All you need is a pair of 2.5V GPIO pins to drive the output, and a single input pin.

100Base-TX Physical Layer

The obvious next question is, what about the next step up, 100Base-TX Ethernet? A bit of Googling failed to turn up anyone who had bit-banged it. How hard can it really be? Let’s take a look at this protocol in depth!

First, the two ends of the link need to decide what speed they’re operating at. This uses a clever extension of the 10Base-T LIT signaling: every 16 ms, rather than sending a single LIT, the PHY sends 17 pulses – identical to the 10Base-T LIT, but renamed fast link pulse (FLP) in the new standard – at 125 µs spacing. Each pair of pulses may optionally have an additional pulse halfway between them. The presence or absence of this additional pulse carries a total of 16 bits of data.

Since FLPs look just like 10Base-T LITs, an older PHY which does not understand Ethernet auto-negotiation will see this stream of pulses as a valid 10Base-T link and begin to send packets. A modern PHY will recognize this and switch to 10Base-T mode. If both ends support autonegotiation, they will exchange feature descriptors and switch to the fastest mutually-supported operating mode.

Figure 17.29 shows an example auto-negotiation frame. The left five data bits indicate this is an 802.3 base auto-negotiation frame (containing the feature bitmask); the two 1 data bits indicate support for 100Base-TX at both half and full duplex.

Supposing that both ends have agreed to operate at 100Base-TX, what happens next? Let’s look at the journey a packet takes, one step at a time from the sender’s MII bus to the receiver’s.

First, the four-bit nibble is expanded into five bits by a table lookup. This 4B/5B code adds transitions to the signal just like Manchester coding, to facilitate clock synchronization at the receiver. Additionally, some additional codes (not corresponding to data nibbles) are used to embed control information into the data stream. These are denoted by letters in the standard.

image

Figure 17.29: Autonegotiation Frame

image

Figure 17.30: MLT-3 Waveform

The first two nibbles of the preamble are then replaced with control characters J and K. The remaining nibbles in the preamble, SFD, packet, and CRC are expanded to their 5-bit equivalents. Control characters T and R are appended to the end of the packet. Finally, unlike 10Base-T, the link does not go quiet between packets; instead, the control character I (idle) is continuously transmitted.

The encoded parallel data stream is serialized to a single bit at 125 Mbps, and scrambled by XORing it with a stream of pseudorandom bits from a linear feedback shift register, using the polynomial x1 1 + x9 + 1 . If the data were not scrambled, patterns in the data (especially the idle control character) would result in periodic signals being driven onto the wire, potentially causing strong electromagnetic interference in nearby equipment. By scrambling the signal these patterns are broken up, and the radiated noise emits weakly across a wide range of frequencies rather than strongly in one.

Finally, the scrambled data is transmitted using a rather unusual modulation known as MLT-3. This is a pseudo-sine waveform which cycles from 0V to +1V, back to 0V, down to 1V, and then back to 0 again. To send a 1 bit the waveform is advanced to the next cycle; to send a 0 bit it remains in the current state for 8 nanoseconds. Figure 17.30 is an example of MLT-3 coded data transmitted by one of my Cisco switches, after traveling through several meters of cable.

MLT-3 is used because it is far more spectrally efficient than the Manchester code used in 10Base-T. Since it takes four 1 bits to trigger a full cycle of the waveform, the maximum frequency is 1/4 of the 125 Mbps line rate, or 31.25 MHz. This is only about 1.5 times higher than the 20 MHz bandwidth required to transmit 10Base-T, and allows 100Base-TX to be transmitted over most cabling capable of carrying 10Base-T.

The obvious question is, can we bit-bang it? Certainly! Since I didn’t have a fast enough MCU, I built a test board (Figure 17.31) around an old Spartan-6 FPGA left over from an abandoned project years ago.

image

Figure 17.31: Spartan-6 Test Board

image

Figure 17.32: TRAGICLASER Block Diagram

Bit-Banging 100Base-TX

A block diagram of the PHY, randomly code-named TRAGICLASER by @NSANameGen,1 is shown in Figure 17.32.

The transmit-side 4B/5B coding, serializing, and LFSR scrambler are straightforward digital logic at moderate to slow clock rates in the FPGA, so we won’t discuss their implementation in detail.

Generating the signal requires creating three differential voltages: 0, +1 , and 1. Since most FPGA I/O buffers cannot operate at 1.0V, or output negative voltages, a bit of clever circuitry is required.

We use a pair of 1K ohm resistors to bias the center tap of the output transformer to half of the 3.3V supply voltage (1.65V). The two ends of the transformer coil are connected to FPGA I/O pins. Since each I/O pin can pull high or low, we have a form of the classic H-bridge motor driver circuit. By setting one pin high and the other low, we can drive current through the line in either direction. By tri-stating both pins and letting the terminating resistor dissipate any charge built up in the cable capacitance, we can create a differential 0 state.

Since we want to drive + / 1V rather than 3.3V, we need to add a resistor in series with the FPGA pins to reduce the drive current such that the receiver sees 1V across the 100 ohm terminator. Experimentally, good results were obtained with 100 ohm resistors in series with a Spartan-6 FPGA pin configured as LVCMOS33, fast slew, 24 mA drive. For other FPGAs with different drive characteristics, the resistor value may need to be slightly adjusted. This circuit is shown in Figure 17.33.

This produced a halfway decent MLT-3 waveform, and one that would probably be understood by a typical PHY, but the rise and fall times as the signal approached the 0V state were slightly slower than the 5 ns maximum permitted by the 802.3 standard (see Figure 17.34).

image

Figure 17.33: H-Bridge Schematic

The solution to this is a clever technique from the analog world known as pre-emphasis. This is a fancy way of saying that you figure out what distortions your signal will experience in transit, then apply the reverse transformation before sending it. In our case, we have good values when the signal is stable but during the transitions to zero there’s not enough drive current. To compensate, we simply need to give the signal a kick in the right direction.

Luckily for us, 10Base-T requires a pretty hefty dose of drive current. In order to ensure we could drive the line hard enough, two more FPGA pins were connected in parallel to each side of the TX-side transformer through 16-ohm resistors. By paralleling these two pins, the available current is significantly increased.

After a bit of tinkering, I discovered that by configuring one of the 10Base-T drive pins as LVCMOS33, slow slew, 2 mA drive, and turning it on for 2 nanoseconds during the transition from the +/ 1 state to the 0 state, I could provide just enough of a shove that the signal reached the zero mark quickly while not overshooting significantly. Since the PHY itself runs at only 125 MHz, the Spartan-6 OSERDES2 block was used to produce a pulse lasting 1/4 of a PHY clock cycle. Figure 17.35 shows the resulting waveforms.2

image

Figure 17.34: Halfway-Decent Waveform

image

Figure 17.35: Waveform using Premphasis

At this point sending the auto-negotiation waveforms is trivial: The other FPGA pin connected to the 16 ohm resistor is turned on for 100 ns, then off. With a Spartan-6 I had good results with LVCMOS33, fast slew, 24 mA drive for these pins. If additional drive strength is required the pre-emphasis drivers can be enabled in parallel, but I didn’t find this to be necessary in my testing.

These same pins could easily be used for 10Base-T output as well (to enable a dual-mode 10/100 PHY) but I didn’t bother to implement this. People have already demonstrated successful bitbanging of 10Base-T, and it’s not much of a POC if the concept is already proven.

That’s it, we’re done! We can now send 100Base-TX signals using six FPGA pins and six resistors!

Decoding 100Base-TX

Now that we can generate the signals, we have to decode the incoming data from the other side. How can we do this?

Most modern FPGAs are able to accept differential digital inputs, such as LVDS, using the I/O buffers built into the FPGA. These differential input buffers are essentially comparators, and can be abused into accepting analog signals within the operating range of the FPGA.

By connecting an input signal to the positive input of several LVDS input buffers, and driving the negative inputs with an external resistor ladder, we can create a low-resolution flash ADC! Since we only need to distinguish between three voltage levels (there’s no need to distinguish the +1 and +2. 5, or 1 and −2.5, states as they’re never used at the same time) we can use two comparators to create an ADC with approximately 1.5 bit resolution.

There’s just one problem: this is a single-ended ADC with an input range from ground to Vdd, and our incoming signal is differential with positive and negative range. Luckily, we can work around this by tying the center tap of the transformer to 1.65V via equal valued resistors to 3.3V and ground, thus biasing the signal into the 0–3.3V range. See Figure 17.36.

After we connect the required 100 ohm terminating resistor across the transformer coil, the voltages at the positive and negative sides of the coil should be equally above and below 1.65V. We can now connect our ADC to the positive side of the coil only, ignoring the negative leg entirely aside from the termination.

image

Figure 17.36: Biasing Schematic

The ADC is sampled at 500 Msps using the Spartan-6 ISERDES. Since the nominal data rate is 125 Mbps, we have four ADC samples per unit interval (UI). We now need to recover the MLT-3 encoded data from the oversampled data stream.

The MLT-3 decoder runs at 125 MHz and processes four ADC samples per cycle. Every time the data changes the decoder outputs a 1 bit. Every time the data remains steady for one UI, plus an additional sample before and after, the decoder outputs a 0 bit. (The threshold of six ADC samples was determined experimentally to give the best bit error rate.) The decoder nominally outputs one data bit per clock however due to jitter and skew between the TX and RX clocks, it occasionally outputs zero or two bits.

The decoded data stream is then deserialized into 5-bit blocks to make downstream processing easier. Every 32 blocks, the last 11 bits from the MLT-3 decoder are complemented and loaded into the LFSR state. Since the 4B/5B idle code is 0x1F (five consecutive 1 bits), the complement of the scrambled data between packets is equal to the scrambler PRNG output. An LFSR leaks 1 bit of internal state per output bit, so given N consecutive output bits from a N-bit LFSR, we can recover the entire state. The interval of 32 blocks (160 bits) was chosen to be relatively prime to the 11-bit LFSR state size.

After the LFSR is updated, the receiver begins XOR-ing the scrambler output with the incoming data stream and checks for nine consecutive idle characters (45 bits). If present, we correctly guessed the location of an inter-packet gap and are locked to the scrambler, with probability 1 − (2−45) of a false lock due to the data stream coincidentally matching the LFSR output. If not present, we guessed wrong and re-try every 32 data blocks until a lock is achieved. Since 100Base-TX specifies a minimum 96-bit inter-frame gap, and we require 45 + 11 = 56 idle bits to lock, we should eventually guess right and lock to the scrambler.

Once the scrambler is locked, we can XOR the scrambler output (5 bits at a time) with the incoming 5-bit data stream. This gives us cleartext 4B/5B data, however we may not be aligned to code-word boundaries. The idle pattern doesn’t contain any bit transitions so there’s no clues to alignment there. Once a data frame starts, however, we’re going to see a J+K control character pair ( 11000 10001 ). The known position of the zero bits allows us to shift the data by a few bits as needed to sync to the 4B/5B code groups.

Decoding the 4B/5B is a simple table lookup that outputs 4-bit data words. When the J+K or T+R control codes are seen, a status flag is set to indicate the start or end of a packet.

If an invalid 5-bit code is seen, an error counter is incremented. Sixteen code errors in a 256-codeword window, or four consecutive packet times without any inter-frame gap, indicate that we may have lost sync with the incoming data or that the cable may have been unplugged. In this case, we reset the entire PHY circuit and attempt to re-negotiate a link.

image

The final 4-bit data stream may not be running at exactly the same speed as the 25 MHz MII clock, due to differences between TX and RX clock domains. In order to rate match, the 4-bit data coming off the 4B/5B decoder (excluding idle characters) is fed into an 32-nibble FIFO. When the FIFO reaches a fill of 16 nibbles (8 bytes), the PHY begins to stream the inbound packet out to the MII bus. We can thus correct for small clock rate mismatches, up to the point that the FIFO underflows or overflows during one packet time.

Test Results

In my testing, the TRAGICLASER PHY was able to link up with both my laptop and my Cisco switch with no issues through an approximately 2-meter patch cable. No testing with longer cables was performed because I didn’t have anything longer on hand; however, since the signal appears to pass the 802.3 eye mask I expect that the transmitter would be able to drive the full 100m cable specified in the standard with no difficulties. The receiver would likely start to fail with longer cables since I’m not doing equalization or adaptive thresholding, however I can’t begin to guess how much you could get actually away with. If anybody decides to try, I’d love to hear your results!

My test bitstream doesn’t include a full 10/100 MAC, so verification of incoming data from the LAN was conducted with a logic analyzer on the RX-side MII bus. (Figure 17.37.)

The transmit-side test sends a single hard-coded UDP broadcast packet in a loop. I was able to pick it up with Wireshark (Figure 17.38) and decode it. My switch did not report any RX-side CRC errors during a 5-minute test period sending at full line rate.

image

Figure 17.37: Receiver Verification

In my test with default optimization settings, the PHY had a total area of 174 slices, 767 LUT6s, and 8 LUTRAMs as well as four OSERDES2 and two ISERDES2 blocks. This is approximately 1/ 4 of the smallest Spartan-6 FPGA (XC6SLX4) so it should be able to comfortably fit into almost any FPGA design. Additionally, twelve external resistors and an RJ-45 jack with integrated isolation transformer were required.

Further component reductions could be achieved if a 1.5 or 1.8V supply rail were available on the board, which could be used (along with two external resistors) to inject the DC bias into the coupling transformer taps at a savings of two resistors. An enterprising engineer may be tempted to use the internal 100 ohm differential terminating resistors on the FPGA to eliminate yet another passive at the cost of two more FPGA pins, however I chose not to go this route because I was concerned that dissipating 10 mW in the input buffer might overheat the FPGA.

Overall, I was quite surprised at how well the PHY worked. Although I certainly hoped to get it to the point that it would be able to link up with another PHY and send packets, I did not expect the TX waveform to be as clean as it was. Although the RX likely does not meet the full 802.3 sensitivity requirements, it is certainly good enough for short-range applications. The component cost and PCB space used by the external passives compare favorably with an external 10/100 PHY if standards compliance or long range are not required.

Source code is available in my Antikernel project.3

image

Figure 17.38: Our Transmitted Packet in Wireshark

image

17:06 The DIP Flip Whixr Trick: An Integrated Circuit That Functions in Either Orientation

by Joe Grand (Kingpin)

Hardware trickery comes in many shapes and sizes: implanting add-on hardware into a finished product, exfiltrating data through optical, thermal, or electromagnetic means, injecting malicious code into firmware, BIOS, or microcode, or embedding Trojans into physical silicon. Hackers, governments, and academics have been playing in this wide open field for quite some time and there’s no sign of things slowing down.

This PoC, inspired by my friend Whixr of #tymkrs , demonstrates the feasibility of an IC behaving differently depending on which way it’s connected into the system. Common convention states that ICs must be inserted in their specified orientation, assisted by the notch or key on the device identifying pin 1, in order to function properly.

So, let’s defy this convention!

————

Most standard chips, like digital logic devices and microcontrollers, place the power and ground connections at corners diagonal from each other. If one were to physically rotate the IC by 180 degrees, power from the board would connect to the ground pin of the chip or vice versa. This would typically result in damage to the chip, releasing the magic smoke that it needs to function. The key to this PoC was finding an IC with a more favorable pin configuration.

While searching through microcontroller data sheets, I came across the Microchip PIC12F629. This particular 8-pin device has power and GPIO (General Purpose I/O) pins in locations that would allow the chip to be rotated with minimal risk. Of course, this PoC could be applied to any chip with a suitable pin configuration.

In the pinout drawing, which shows the chip from above in its normal orientation, arrows denote the alternate functionality of that particular pin when the chip is rotated around. Since power (VDD) is normally connected to pin 1 and ground (VSS) is normally connected to pin 8, if the chip is rotated, GP2 (pin 5) and GP3 (pin 4) would connect to power and ground instead. By setting both GP2 and GP3 to inputs in firmware and connecting them to power and ground, respectively, on the board, the PIC will be properly powered regardless of orientation.

————

I thought it would be fun to change the data that the PIC sends to a host PC depending on its orientation.

On power-up of the PIC, GP1 is used to detect the orientation of the device and set the mode accordingly. If GP1 is high (caused by the pull-up resistor to VCC), the PIC will execute the normal code. If GP1 is low (caused by the pull-down resistor to VSS), the PIC will know that it has been rotated and will execute the alternate code. This orientation detection could also be done using GP5, but with inverted polarity.

The PIC’s UART (asynchronous serial) output is bit-banged in firmware, so I’m able to reconfigure the GPIO pins used for TX and RX (GP0 and GP4) on-the-fly. The TX and RX pins connect directly to an Adafruit FTDI Friend, which is a standard FTDI FT232R-based USB-to-serial adapter. The FTDI Friend also provides 5V (VDD) to the PoC.

In normal operation, the device will look for a key press on GP4 from the FTDI Friend’s TX pin and then repeatedly transmit the character ‘A’ at 9600 baud via GP0 to the FTDI Friend’s RX pin. When the device is rotated 180 degrees, the device will look for a key press on GP0 and repeatedly transmit the character ‘B ’ on GP4. As a key press detector, instead of reading a full character from the host, the device just looks for a high-to-low transition on the PIC’s currently configured RX pin. Since that pin idles high, the start bit of any data sent from the FTDI Friend will be logic low.

image
image
image

For your viewing entertainment, a demonstration of my breadboard prototype can be found on Youtube.0 Complete engineering documentation, including schematic, bill-of-materials, source code, and layout for a small circuit board module are also available.1

Let this PoC serve as a reminder that one should not take anything at face value. There are an endless number of ways that hardware, and the electronic components within a hardware system, can misbehave. Hopefully, this little trick will inspire future hardware mischief and/or the development of other sneaky circuits. If nothing else, you’re at least armed with a snarky response for the next time some over-confident engineer insists ICs will only work in one direction!

17:07 Injecting shared objects on FreeBSD with libhijack.

by Shawn Webb

In the land of red devils known as Beasties exists a system devoid of meaningful exploit mitigations. As we explore this vast land of opportunity, we will meet our ELFish friends, [p]tracing their very moves in order to hijack them. Since unprivileged process debugging is enabled by default on FreeBSD, we can abuse ptrace to create anonymous memory mappings, inject code into them, and overwrite PLT/GOT entries.0 We will revive a tool called libhijack to make our nefarious activities of hijacking ELFs via ptrace relatively easy.

Nothing presented here is technically new. However, this type of work has not been documented in this much detail, so here I am, tying it all into one cohesive work. In Phrack 56:7, Silvio Cesare taught us fellow ELF research enthusiasts how to hook the PLT/GOT.1 Phrack 59:8, on Runtime Process Infection, briefly introduces the concept of injecting shared objects by injecting shellcode via ptrace that calls dlopen() .2 No other piece of research, however, has discovered the joys of forcing the application to create anonymous memory mappings from which to inject code.

image

This is only part one of a series of planned articles that will follow libhijack’s development. The end goal is to be able to anonymously inject shared objects. The libhijack project is maintained by the SoldierX community.

Previous Research

All prior work injects code into the stack, the heap, or existing executable code. All three methods create issues on today’s systems. On AMD64 and ARM64, the two architectures libhijack cares about, the stack is non-executable by default. The heap implementation on FreeBSD, jemalloc creates non-executable mappings. Obviously overwriting existing executable code destroys a part of the executable image.

PLT/GOT redirection attacks have proven extremely useful, so much so that read-only relocations (RELRO) is a standard mitigation on hardened systems. Thankfully for us as attackers, FreeBSD doesn’t use RELRO, and even if FreeBSD did, using ptrace to do devious things negates RELRO as ptrace gives us God-like capabilities. We will see the strength of PaX NOEXEC in HardenedBSD, preventing PLT/GOT redirections and executable code injections.

The Role of ELF

FreeBSD provides a nifty API for inspecting the entire virtual memory space of an application. The results returned from the API tells us the protection flags of each mapping (readable, writable, executable.) If FreeBSD provides such a rich API, why would we need to parse the ELF headers?

We want to ensure that we find the address of the system call instruction in a valid memory location.3 On ARM64, we also need to keep the alignment to eight bytes. If the execution is redirected to an improperly aligned instruction, the CPU will abort the application with SIGBUS or SIGKILL. Intel-based architectures do not care about instruction alignment, of course.

PLT/GOT hijacking requires parsing ELF headers. One would not be able to find the PLT/GOT without iterating through the Process Headers to find the Dynamic Headers, eventually ending up with the DT_PLTGOT entry.

We make heavy use of the Struct_Obj_Entry structure, which is the second PLT/GOT entry. Indeed, in a future version of libhijack, we will likely handcraft our own Struct_Obj_Entry object and insert that into the real RTLD in order to allow the shared object to resolve symbols via normal methods.

Thus, invoking ELF early on through the process works to our advantage. With FreeBSD’s libprocstat API, we don’t have a need for parsing ELF headers until we get to the PLT/GOT stage, but doing so early makes it easier for the attacker using libhijack, which does all the heavy lifting.

Finding the Base Address

Executables come in two flavors: Position-Independent Executables (PIEs) and regular ones. Since FreeBSD does not have any form of address space randomization (ASR or ASLR), it doesn’t ship any application built in PIE format.

Because the base address of an application can change depending on: architecture, compiler/linker flags, and PIE status, lib-hijack needs to find a way to determine the base address of the executable. The base address contains the main ELF headers.

libhijack uses the libprocstat API to find the base address. AMD64 loads PIE executables to 0x01021000 and non-PIE executables to a base address of 0x00200000 . ARM64 uses 0x0010-0000 and 0x00100000 , respectively.

libhijack will loop through all the memory mappings as returned by the libprocstat API. Only the first page of each mapping is read in–enough to check for ELF headers. If the ELF headers are found, then libhijack assumes that the first ELF object is that of the application.

image
image
image

Assuming that the first ELF object is the application itself, though, can fail in some corner cases, such as when the RTLD (the dynamic linker) is used to execute the application. For example, instead of calling /bin/ls directly, the user may instead call /libexec/ld-elf.so.1 /bin/ls . Doing so causes libhijack to not find the PLT/GOT and fail early sanity checks. This can be worked around by providing the base address instead of attempting auto-detection.

The RTLD in FreeBSD only recently gained the ability to execute applications directly. Thus, the assumption that the first ELF object is the application is generally safe to make.

Finding the Syscall

As mentioned above, we want to ensure with 100% certainty we’re calling into the kernel from an executable memory mapping and in an allowed location. The ELF headers tell us all the publicly accessible functions loaded by a given ELF object.

The application itself might never call into the kernel directly. Instead, it will rely on shared libraries to do that. For example, reading data from a file descriptor is a privileged operation that requires help from the kernel. The read() libc function calls the read syscall.

libhijack iterates through the ELF headers, following this pseudocode algorithm:

  • Locate the first Obj_Entry structure, a linked list that describes loaded shared object.
  • Iterate through the symbol table for the shared object:
    • - If the symbol is not a function, continue to the next symbol or break out if no more symbols.
    • - Read the symbol’s payload into memory. Scan it for the syscall opcode, respecting instruction alignment.
    • - If the instruction alignment is off, continue scanning the function.
    • - If the syscall opcode is found and the instruction alignment requirements are met, return the address of the system call.
  • Repeat the iteration with the next Obj_Entry linked list node.

This algorithm is implemented using a series of callbacks, to encourage an internal API that is flexible and scalable to different situations.

image
image
Creating a new memory mapping

Now that we found the system call, we can force the application to call mmap . AMD64 and ARM64 have slightly different approaches to calling mmap . On AMD64, we simply set the registers, including the instruction pointer to their respective values. On ARM64, we must wait until the application attempts to call a system call, then set the registers to their respective values.

Finally, in both cases, we continue execution, waiting for mmap to finish. Once it finishes, we should have our new mapping. It will store the start address of the new memory mapping in rax on AMD64 and x0 on ARM64. We save this address, restore the registers back to their previous values, and return the address back to the user.

The following is handy dandy table of calling conventions.

Arch

Register

Value

AMD64

rax

syscall number

 

rdi

addr

 

rsi

length

 

rdx

prot

 

r10

flags

 

r8

fd (-1)

 

r9

offset (0)

aarch64

x0

syscall number

 

x1

addr

 

x2

length

 

x3

prot

 

x4

flags

 

x5

fd (-1)

 

x6

offset (0)

 

x8

terminator

image

Currently, fd and offset are hardcoded to -1 and 0 respectively. The point of libhijack is to use anonymous memory mappings. When mmap returns, it will place the start address of the new memory mapping in rax on AMD64 and x0 on ARM64. The implementation of md_map_memory for AMD64 looks like the following:

image

Even though we’re going to write to the memory mapping, the protection level doesn’t need to have the write flag set. Remember, with ptrace, we’re gods. It will allow us to write to the memory mapping via ptrace, even if that memory mapping is non-writable.

image

HardenedBSD, a derivative of FreeBSD, prevents the creation of memory mappings that are both writable and executable. If a user attempts to create a memory mapping that is both writable and executable, the execute bit will be dropped. Similarly, it prevents upgrading a writable memory mapping to executable with mprotect , critically, it places these same restrictions on ptrace . As a result, libhijack is completely mitigated in HardenedBSD.

Hijacking the PLT/GOT

Now that we have an anonymous memory mapping we can inject code into, it’s time to look at hijacking the Procedure Linkage Table/Global Offset Table. PLT/GOT hijacking only works for symbols that have been resolved by the RTLD in advance. Thus, if the function you want to hijack has not been called, its address will not be in the PLT/GOT unless BIND_NOW is active.

The application itself contains its own PLT/GOT. Each shared object it depends on has its own PLT/GOT as well. For example, libpcap requires libc. libpcap calls functions in libc and thus needs its own linkage table to resolve libc functions at runtime.

This is the reason why parsing the ELF headers, looking for functions, and for the system call as detailed above works to our advantage. Along the way, we get to know certain pieces of info, like where the PLT/GOT is. libhijack will cache that information along the way.

In order to hijack PLT/GOT entries, we need to know two pieces of information: the address of the table entry we want to hijack and the address to point it to. Luckily, libhijack has an API for resolving functions and their locations in the PLT/GOT.

Once we have those two pieces of information, then hijacking the GOT entry is simple and straight-forward. We just replace the entry in the GOT with the new address. Ideally, the the injected code would first stash the original address for later use.

Case Study: Tor Capsicumization

Capsicum is a capabilities framework for FreeBSD. It’s commonly used to implement application sandboxing. HardenedBSD is actively working on integrating Capsicum for Tor. Tor currently supports a sandboxing methodology that is wholly incompatible with Capsicum. Tor’s sandboxing model uses seccomp(2) , a filtering-based sandbox. When Tor starts up, Tor tells its sandbox initialization routines to whitelist certain resources followed by activation of the sandbox. Tor then can call open(2) , stat(2) , etc. as needed on an on-demand basis.

In order to prevent a full rewrite of Tor to handle Capsicum, HardenedBSD has opted to use wrappers around privileged function calls, such as open(2) and stat(2) . Thus, open(2) becomes sandbox_open() .

Prior to entering capabilities mode (capmode for short), Tor will pre-open any directories within which it expects to open files. Any time Tor expects to open a file, it will call openat rather than open . Thus, Tor is limited to using files within the directories it uses. For this reason, we will place the shared object within Tor’s data directory. This is not unreasonable, since we either must be root or running as the same user as the tor daemon in order to use libhijack against it.

Note that as of the time of this writing, the Capsicum patch to Tor has not landed upstream and is in a separate repository.4

Since FreeBSD does not implement any meaningful exploit mitigation outside of arguably ineffective stack cookies, an attacker can abuse memory corruption vulnerabilities to use ret2libc style attacks against wrapper-style capsicumized applications with 100% reliability. Instead of returning to open , all the attacker needs to do is return to sandbox_open . Without exploit mitigations like PaX ASLR, PaX NOEXEC, and/or CFI, the following code can be used copy/paste style, allowing for mass exploitation without payload modification.

To illustrate the need for ASLR and NOEXEC, we will use lib-hijack to emulate the exploitation of a vulnerability that results in a control flow hijack. Note that due using libhijack, we bypass the forward-edge guarantees CFI gives us. LLVM’s implementation of CFI does not include backward-edge guarantees. We could gain backward-edge guarantees through SafeStack; however, Tor immediately crashes when compiled with both CFI and SafeStack.

In code on pages 581 and 582, we perform the following:

  • We attach to the victim process.
  • We create an anonymous memory allocation with read and execute privileges.
  • We write the filename that we’ll pass to sandbox_open() into the beginning of the allocation.
  • We inject the shellcode into the allocation, just after the filename.
  • We execute the shellcode and detach from the process
  • We call sandbox_open . The address is hard-coded and can be reused across like systems.
  • We save the return value of sandbox_open , which will be the opened file descriptor.
  • We pass the file descriptor to fdopen . The address is hard-coded and can be reused on all similar systems.
  • The RTLD loads the shared object, calling any initialization routines. In this case, a simple string is printed to the console.
The Future of libhijack

Writing devious code in assembly is cumbersome. Assembly doesn’t scale well to multiple architectures. Instead, we would like to write our devious code in C, compiling to a shared object that gets injected anonymously. Writing a remote RTLD within libhijack is in progress, but it will take a while as this is not an easy task.

Additionally, creation of a general-purpose helper library that gets injected would be useful. It could aid in PLT/GOT redirection attacks, possibly storing the addresses of functions we’ve previously hijacked. This work is dependent on the remote RTLD.

Once the ABI and API stabilize, formal documentation for libhijack will be written.

Conclusion

Using libhijack, we can easily create anonymous memory mappings, inject into them arbitrary code, and hijack the PLT/-GOT on FreeBSD. On HardenedBSD, a hardened derivative of FreeBSD, out tool is fully mitigated through PaX’s NOEXEC.

We’ve demonstrated that wrapper-style Capsicum is ineffective on FreeBSD. Through the use of libhijack, we emulate a control flow hijack in which the application is forced to call sandbox_open and fdlopen(3) on the resulting file descriptor.

Further work to support anonymous injection of full shared objects, along with their dependencies, will be supported in the future. Imagine injecting libpcap into Apache to sniff traffic whenever “GET /pcap” is sent.

image
image

/* sandbox_fdlopen . asm */
BITS 64
mov rbp, rsp

; Save registers
push rdi
push rsi
push rdx
push rcx
push rax

; Call sandbox_open
mov rdi, 0 x4000
xor rsi, rsi
xor rdx, rdx
xor rcx, rcx
mov rax, 0 x011c4070
call rax ; sandbox_open

; Call fdlopen
mov rdi, rax
mov rsi, 0 x101
mov rax, 0 x8014c3670
call rax ; fdlopen

; Restore registers
pop rax
pop rcx
pop rdx
pop rsi
pop rdi

mov rsp, rbp
ret

[ notice ] Tor 0.3.2.2 - alpha running on FreeBSD with Libevent
           2.1.8 - stable, OpenSSL 1.0.2 k-freebsd, Zlib 1.2.11,
           Liblzma N/A, and Libzstd N/A .
[ notice ] Tor can't help you if you use it wrong !
           https :// www.torproject.org / download / download # warning
[ notice ] This version is not a stable Tor release. Expect more
           bugs than usual.
[ notice ] Read configuration file "~/ installs / etc / tor / torrc".
[ notice ] Scheduler type KISTLite has been enabled.
[ notice ] Opening Socks listener on 127.0.0.1:9050
[ notice ] Parsing GEOIP IPv4 file ~/ installs / share / tor / geoip.
[ notice ] Parsing GEOIP IPv6 file ~/ installs / share / tor / geoip6.
[ notice ] Bootstrapped 0%: Starting
[ notice ] Starting with guard context "default"
[ notice ] Bootstrapped 80%: Connecting to the Tor network
[ notice ] Bootstrapped 85%: Finishing handshake with first hop
[ notice ] Bootstrapped 90%: Establishing a Tor circuit
[ notice ] Tor has successfully opened a circuit. Looks like
           client functionality is working.
[ notice ] Bootstrapped 100%: Done
This is from an injected shared object. You've been pwned.

FreeBSD system administrators should set security.bsd.unprivileged_proc_debug to 0 to prevent abuse of ptrace . To prevent process manipulation, FreeBSD developers should implement PaX NOEXEC.

Source code is available.5

image

17:08 Murder on the USS Table

by Soldier of Fortran
concerning an adventure with Bigendian Smalls

The following is a dramatization of how I learned to write assembler, deal with mainframe forums, and make kick-ass VTAM USS Tables. Names have been fabricated, and I won’t let the truth get in the way of a good story, but the information is real.

It was about eleven o’clock in the evening, early summer, with the new moon leaving an inky darkness on the streets. The kids were in bed dreaming of sweet things while I was nursing a cheap bourbon at the kitchen table. Dressed in an old t-shirt reminding me of better days, and cheap polyester pants, I was getting ready to call it a night when I saw trouble. Trouble has a name, Bigendian Smalls. A tall, blonde, drink of water who knows more about mainframe hacking than anyone else on the planet, with a penchant for cargo shorts. I could never say no to cargo shorts.

The notification pinged my phone before it made it to Chrome. I knew, right then and there I wasn’t calling it a night. Biggie needed something, and he needed it sooner rather than later. One thing you should know about me, I’m no sucker, but when a friend is in need I jump at the chance to lend a hand.

Before opening the message, I poured myself another glass. The sound of the cheap, room temperature bourbon cracking the ice broke the silence in my small kitchen, like an e-sport pro cracking her knuckles before a match. I opened the message:

“Hey, I need your help. Can you make a mainframe logon screen for Kerberos? But can you add that stupid Windows 10 upgrade popup when someone hits enter?”

“Yeah,” I replied. I’m not known for much. I don’t have money. I’m as cheap as a Garfield joke in the Sunday papers. But I can do one thing well: Mainframe EBCDIC Art.

I knew It was going to be a play on Cerberus, the three-headed dog. Finding that ASCII was the easy part. ASCII art has been around since the creation of the keyboard. People need to make art, regardless of the tool. Finding ASCII art was going to be simple. Google, DuckDuckGo, or in desperate times and lots of good scotch, Bing, will supply the base that I need to create my master piece. The first response for a search for “Cerberus” and “ASCII” yielded my three-headed muse.

The rest, however would require a friend’s previous work, as well as a deep understanding of the TN3270 protocol and mainframe assembler.

image
image

When I got in to this game six years ago it was because I was tired of looking at the red “ Z .” That red was rough, as though accessing this mainframe was going to lead me right to Satan himself. (Little did I know I’d actually be begging to get by Cerberus.)

The world of mainframes, it’s a different world. A seedier world. One not well-traveled by the young, often frequented by the harsh winds of corporate rule. Nothing on the mainframe comes easy or free. If you want to make art, you’ll need more than just a keyboard.

I started innocently enough, naively searching simple terms like “change mainframe logon screen.” I stumbled around search results, and into chatrooms like a newborn giraffe learning to walk. You know the type, a conversation where everyone is trying to prove who’s the smartest in the room. While ultimately useless, those initial searches taught me three things: I needed to understand the TN3270 protocol, z/OS High Level Assembler (HLASM), and what the hell a VTAM and the USS Table were.

I always knew I would have to learn TN3270. It’s the core of mainframe–user interaction. That green screen you see in movies when they say someone “just hacked a mainframe.” I just never thought it would be to make art for my friends. TN3270 is based on Telnet. Or put another way, Telnet is to TN3270 as a bike is to an expensive motorcycle. They sort of start out the same but after you make the wheels and frame they’re about as different as every two-bit shoe shine.

Looking at the way mainframes and their clients talk to one another is easy enough to understand, at first. Take a look at Figure 17.39.

For anyone who understood telnet like I did, this handshake was easy enough to understand.

image

But that’s where it ended. Once the client was done negotiating the telnet options, the rest of the data looked garbled if you weren’t trained to spot it.

You see, mainframes came from looms. Looms spoke in punch-cards which eventually moved to computers speaking EBCDIC. So, mainframes kept the language alive, like a small Quebec town trying to keep French alive. That TN3270 data was now going to be driven by an exclusively EBCDIC character set. All the rest of the options negotiated, and commands sent, would be in this strange, ancient language. Lucky for me, my friend Tommy knows all about TN3270 and EBCDIC.0 And Tommy owed me a favor.

image

Figure 17.39: TN3270 Packet Trace

Just past a Chinese restaurant’s dumpster was the entrance to Tommy’s place. You’d never know it even existed unless you went down the alleyway to relieve yourself. As I approached the dark green door, I couldn’t help but notice the pungent smell of decaying cabbage and dreams, steam billowing out of a vent smelled vaguely of pork dumplings. I knocked three times. The door opened suddenly and I was ushered in. I felt Tommy slam the door shut and heard no fewer than three cheap chain-locks set in to place.

Tommy’s place was stark white, like a website from the early nineties. No art, no flashing neon, just plain white with printouts stuck on the white walls and the quiet hum of an unseen computer. The kind of place that makes you want to slowly wander around an Ikea. Tommy liked to keep things clean and simple and this place reflected that.

Tommy, in his white lab coat, was a just a regular man. As regular and boring as a vodka with lime and soda, if vodka, with lime and soda, wore large rimmed glasses. But he knew his way around TN3270, and that’s what I needed right now.

“So, I hear you need some help with TN3270?” Tommy asked. He already knew why I was there.

“Yeah, I can’t figure this garbage out and I need help writing my own,” I replied.

Tommy sighed and began explaining what I needed to know. He walked over to one of three whiteboards in the room.

“The key thing you need to know is that after you negotiate TN3270 there are seven control characters. But if all you want to do it make art, you only need to know these four:

image
image
image

Figure 17.40: Placing the cursor after drawing.

image

“Unlike telnet, TN3270 is a basically 1920 character string, for the original 24 × 80 size. The terminal knows you’re starting ’cuz the first byte you send is a command (i.e. x05 ) followed by a Write Control Character (WCC). For you, sir artist, you’ll want to send ‘Erase/Write/Alternate.’ or xF5x7A . This gives you a blank canvas to work with by clearing the screen and resetting the terminal.

“The remaining makeup of the screen is up to you. You use SBA to tell the terminal where you want your cursor to be, then use the ‘Start Field’/‘Start Field Extended’ commands to tell the terminal what kind of field it is going to be, also known as an attribute. Start field is used to lock and unlock the screen, but for your art it doesn’t matter.

“One thing you’ll need to watch out for, anytime you use SF/SFE, is that it takes up one byte on the screen. Setting the buffer location does not. Once you’re done with your art, you’ll need to place the cursor somewhere, using IC.”

Starting to understand, I headed to the white board and wrote Figure 17.40 in black marker.

“Yes! That’s it!” exclaimed Tommy. “With what you have now, you could make a monochrome masterpiece! Keep in mind that the SF eats up one space. So basically you could fill out the rest of the screen’s 1,919 characters, remembering that the line wraps at every 80 characters. But let’s talk about SF and SFE.”

“In your, frankly simple, example,” Tommy continued, “you’d never get any color. To do that, we need to talk about the Start Field Extended (x29 ) command. That command is made up of the SFE byte itself, followed by a byte for the number of attributes, and then the attributes themselves.

“There’s two attributes we care about: SF ( xC0 ), and the most important one, which I’ll get to in a minute. SF is what we use like above to control the screen. If we wanted to protect the screen from being edited we could set it to xF8.

image

Figure 17.41: Tommy’s Color Example

“Now, you’ll want to listen closely because this attribute is arguably the most important to you. The color attribute ( x42 ) lets you set a color. Your choices are xF1 through xF7 .”

image

Tommy grabs the black marker from my hand and begins adding to my simple example. (Figure 17.41.)

“So, with a bit of this code, we can add a color statement to your commands. Remember to move the cursor to the end though.

“There’s one last thing you should know, but it’s a little advanced. You can set the location using SBA followed by a row/-column value. Right now, you’ve set the buffer to 0/0. But using this special table,” Tommy pointed to a printout he had laminated and stuck to his wall,1 “we can point the buffer anywhere we—”

Just then the door burst open, the sounds of those cheap locks breaking and hitting the floor echoed through the room. A dark figure stood in the doorway holding some type of automatic gun, which I couldn’t place. Tommy quickly took cover behind a desk and I followed suit. I heard a voice yell out “How dare you teach him the way! He might not have the access he needs! Did you ask if he’s allowed to make the kind of changes you’re teaching? He should’ve spoken to his system programmer and read the manuals!”

Tommy, visibly shaken, shouted, “Rico! I’m sorry! I owed someone a favor and. . .”

Rico opened fire. Little pieces of shattered whiteboard hitting me in the face. He wasn’t aiming for us, but had destroyed our notes on the white board. I looked over and saw Tommy cowering under his desk, I had figured ‘Tommy’ was a nickname for a favorite firearm, guess I was wrong.

“You’ve given out free TN3270 help for the last time Tommy!” Rico shouts, and I heard the familiar sound of a gun being reloaded. I took a quick peek from my hiding place and noticed that Rico hadn’t even bothered to take cover, still standing in the doorway. Not wanting my epitaph to read, “Here lies a coward who died learning TN3270 behind a Chinese restaurant,” I pulled out my Colt detective special and opened fire. My aim had always been atrocious, but I fired blindly in the direction of the door, heard a yelp, and then silence.

Tommy popped his head above the desk, “He’s gone, looks like he ran off, you better get out of here in case he and his goons return.”

I took this as my cue and headed towards the door. I noticed part of the frame had splintered, and in the center of those splinters was my slug. looks like I just missed Rico.

Tommy grabbed my arm as I’m about to leave, “You still need to learn some assembler and VTAM, go talk to Dave at The Empress, he can help you out. But never come back here again, you’re too much trouble.”

The Empress. On the books it was a hotel. Off the books it’s where you went when you wanted help forgetting about the outside world. The lobby looked and smelled like a cheap computer case that hadn’t been cleaned out for years. Half the lights in the chandelier didn’t work, and it cast odd shadows on the furniture, giving the impression someone was there, watching you. It was the kind of place European tourists booked because Travelocity got them a great deal, but the price would immediately change once they arrived. No one came to the Empress for its good looks. Not-quite-top-40 music emanated from the barroom.

I walked to the front desk, where a young man with a name tag that said “No Name” looked me up and down. “Can I help you?” Millennial sarcasm dripped off of every syllable. “I need to speak to Dave,” I replied. The clerk’s eyes widened a little, he quickly looked around and whispered “follow me.”

The clerk walked me past the kitchen, through the back hallways, in to the laundry room. He ushered me in, then abruptly left. A sole person was folding linens in front of an industrial washing machine, a freshly lit cigarette hung loosely from his lips. The fluorescent light turned his skin a pale shade of blue. “Dave?” I called out.2 Dave put the bed sheet down and walked over. “Who wants to know?” he asked.

“Tommy sent me,” I replied.

Dave takes a long pull on his coffin nail, “Shit,” he says exhaling a large puff, “you tell Tommy that we’re square after this. I assume you’re here to learn HLASM? Can I ask why?”

“I’m trying to make some my mainframe look better.” I replied.

Dave wasn’t a tall man, but his stature, deep voice, and frame more than made up for it. The type of man you could trust to knock you out in one punch. His white hotel uniform was stained with what I hoped wasn’t blood.

image

He sighed and said “this way.”

Dave led me to a small room off the laundry area with some books on the wall, lit by a single, bare bulb in the ceiling fixture. A black chalkboard stood in one corner, an old terminal on a standing desk, all the rage these days, at in the other. The walls were bare concrete. “I assume you already know JCL?” queried Dave.

“Yes” I replied with a failed attempt at sarcasm, “of course I know JCL.”3

“Good, this will be easy then.” He took another pull of his smoke and began writing on the blackboard, “There’re four executables available to you to compile an HLASM program on the mainframe. They are:

image

Dave walked over to the terminal and pulled up a file on the screen. “You need to pass it some options, like this,” he said, pointing to a line on the screen:

image

“Anything you type on the next line, after the * must be in HLASM and will be compiled by ASMACL. Don’t worry about finding it, ASMACL is given to us by Big Blue.” Dave’s calloused fingers flew over the keyboard and a moment later I was staring at a blank file with the JCL job card and compiler stuff filled out. “First, there’re some rules with HLASM you should know. Each line can either be an instruction, continuation, or comment. Comments start with ‘* ’. A Continuation line means that in the previous line there’s a character (any character, doesn’t matter which) in column 72, and the continued line itself must start on column 16.”

image

Figure 17.42: Dave’s Continuation Example

“You with me so far?”

I nodded.

“Good. Now, If it’s not a comment or a continuation, the line can be broken down like so:

“The first 10 characters can be empty or be a name/label. Following that you have your instruction, a space, then your operands for that instruction. Anything after the operands is a comment until the 71st column. Here’s a dirty example.” (Figure 17.42.)

“Every line can have a name. In HLASM you can create basic variables with an & in front of them. But not every line needs a name. Take a look at these three lines:

image

“Line one sets a symbol/label to &BLUE. If Tommy did his job right you should be able to recognize what it is supposed to do. The next line is DC, Declare Constant. Notice &BLUE has an X. That means it’s in hex. When we want to send text, we can use ‘C’ for CHAR . If we wanted we could’ve written the above like this.” I watched as his fingers danced across the keyboard.

image

“But you’ll likely be switching colors, so setting them all to variables makes your life easier. One caveat with using variables in HLASM: The assembler will replace any value you have with the variable, take a look at this:

image

“Lets break this down so you can see what the compiler would do:

image

“This understanding will come in handy when you’re making a USS Table.” I still didn’t know what a USS Table was, but I let him go on. “If you have stuff you’re going to do over and over again, it would be easier to make a function, or in HLASM a macro, to handle the various request types. Macros are easy. On a single line you declare ‘MACRO’ in column 10. The next line you give the macro a name, and it’s operands. You end a macro with the word ‘MEND ’ in column 10 on a single line. For example:”

image

I thought I was starting to get it, so I decided to ask a question. “How would we do an IF statement?” I asked.

Dave smiles, but only a little, and walks back over to the blackboard and scribbles out the following:

image

“In HLASM you can use the AIF instruction. It’s kind of like an IF. Here we have some code that will print ‘Not Equal to 2’ and ‘End of Line.’ If we set &MSG to ‘02’ it would jump ahead to .SKIP , what Big Blue would call a label.

“I see you staring at that ANOP. I know what you’re thinking, and the answer is yes. It’s exactly like a NOP in x86. Except it’s not an opcode, but a HLASM assembler instruction.”

Dave headed back to the terminal and quickly scrolled to the bottom. “There’s one last thing, since we’re using ASMACL you need to tell the compiler where to put the compiled files. Take a look at this.”

image

Dave tapped on the glowing screen. “This line right here. This tells the compiler to make a file USSCORP in the folder USER.VTAMLIB.” I knew he meant Member and Partitioned Dataset but I figured Dave was dumbing things down for me and didn’t want to interrupt. “That’s where your new USS Table goes,” he continued.

I jumped as someone softly knocked on the door, guess I was still a little jumpy from my encounter at Tommy’s. I saw through the round window in the door that the clerk had returned. Dave headed over and opened the door. I couldn’t quite make out what they were saying to each other. Dave looked at his watch and turned to me, “Look, this has been swell, but you gotta get outta here. If my boss finds out I taught you this there’ll be hell to pay and I’m not looking to sleep with the fishes tonight—or any night. Sorry we’re cutting this short, normally I’d be teaching you about the 16 registers and program entrance and exit, but we don’t have time for that. And besides, you don’t need it to be a VTAM artist, but if you want to learn, read this.” And he shoved a rather large slide deck in to my chest, at least 400 pages thick.4

No Name told me to follow him yet again. As we left the laundry room I saw Dave stuffing soiled linens in to one of those washers; this time there’s no wondering if it was blood or not. No Name ushered me down a different hallway than the one we came in. He walked quickly, with purpose. I struggle to keep up.

We ended up at a door labeled ‘Emergency Exit.’ No Name opened the door and I headed through. Before I could turn around to say thanks, the big metal door slammed closed. I found myself in another dead-end alleyway. The air was cool now, the wind moist, betraying a rain fall that was yet to start.

I began heading towards the road when a shadowy figure stepped into the alley. I couldn’t make out what he looked like, the neon signs behind him made a perfect silhouette. But I could already tell by his stance I was in trouble.

“So,” the figure called out, “the boss tells me you’re trying to change the USS Table eh?” I figured this must be one of Rico’s goons.

“I don’t mean nothing by it,” I replied, “I’m just trying to make my mainframe nicer.”

“Rico has a message for you ‘if you’re trying to change the mainframe you should be talking to the people who run your mainframe, I’ve had enough of this business.’”

The gunshot echoed through the alleyway, the round hitting me square in the chest like a gamer punching his monitor in a rage quit. I landed on flat my back, smacked my head on the cold concrete, and sent pages of assembler lessons flying through the air. The wind knocked out of me, I felt the blackness take hold as I lay on the sidewalk. I could barely make out the figure standing over me, whispering “when you get to the pearly gates, tell ’em the EF Boys sent ya.”

You know those dreams you have. The kind where you’re in a water park, floating along a lazy river, or down a waterslide. I was having one of those. It was nice. Until I realized why I was dreaming of getting wet. I woke face up, in an alleyway, the rain pounding me mercilessly. My trench coat was drenched by the downpour. I stood up, slowly, still dizzy from getting knocked out.

How had I survived? I looked around and saw papers strewn about the alley. Something shiny, just next to where I took my forced nap, caught my eye. It was a neat pile of papers, held together by a dimple on the top sheet. I took a closer look and picked up the pages.

Well I’ll be damned, the 400+ pages of assembler material took the bullet for me! Almost square in the middle was the bullet meant to end my journey. I eternally grateful that Dave had given me those pages. Now, determined more than ever to finish what I started, I headed toward the street. I had two of the three pieces to the puzzle, but I needed dry clothes and my office was closer than going home.

Nestled above a tech start-up on its last legs was a door that read ’Soldier of FORTRAN: Mainframe Hacker Extraordinaire.’ Inside was a desk, a chair, an LCD monitor and a PC older than the startup. A window, a quarter of the Venetian blinds torn free, looked out over the street. I didn’t bother turning on the lights. The orange light that bled in from the lamppost on the street was enough. I pulled out my phone, put it on the desk, and started changing in to my dry clothes. The clothes were for when I hoped I would start biking to work which, as with all new year’s resolutions, were yesterday’s dream.

Now dry, I decided to power on my PC and take some notes. I wrote down what I knew about TN3270 thanks to Tommy and HLASM courtesy of Dave. I was still missing a big piece. Where could I learn about this USS Table. My searches all led to the same place: The Mailing-List. A terrible bar on the other side of town I had no desire to visit. The Mailing-List, or ‘Dash L’ as some people called it, was filled with some of the meanest, least helpful individuals on this Big Blue planet. I was likely to get chased out of the place before I was even done asking my question, let alone receiving an answer.

Don’t get me wrong, sometimes Dash L had some great conversations; I know because I often lurk there for information I can use. But I had never worked up the courage to ask a question there, lest I be banned for life. With nothing else to go on, I grabbed my coat and umbrella and headed for the door.

Just then, my phone rang. I didn’t recognize the name-Nigel, or the number. I decided to answer the phone. “Who’s this, how’d you get my private number?” No reply. I went to hang up the phone when I heard, “try searching for USSTAB and MSG10.” My phone vibrated, letting me know the call was over. I ran to the window and peered out in to the rainy night. The street was empty except for a man with an umbrella putting his phone away. I ran down the stairs and caught a glimpse of the man as he got into his Tesla and sped off.

Back at my desk, I searched for USSTAB and MSG10 and one name kept coming back: Big John. I knew Big John, of course. Anyone who did mainframe hacking knew him. He now played the ivories over at a fancy new club, the Duchess. My dusty work clothes would have to be fancy enough.

————

You wouldn’t know the Duchess was much, just by looking at it. A single purple bulb above a bright red vinyl entrance. The lamp shade cast a triangle of light over the door. The only giveaway that this was a happening place was the sound of 80s Synth rolling down the streets. Not the cheap elevator synth you get while waiting for your coffee, this was real synth: soulful and painful. The kind that made you doubt yourself and your life choices.

I walked to the door and knocked. A slit opened up, “Can we help you?” a woman’s voice asked. I couldn’t wait for this new speakeasy revival trend to die. “Yes,” I replied, “I’m here to see Big John.”

“You have a reservation?” she asked.

“Nope, just here to see Big John.”

“Honey, you outta luck. We got a whole room of people here to see Big John, and they got reservations!”

“How much sweetener to see him play tonight?” I ask.

A second slot near my dad gut opened up, and a drawer popped out, almost like the door was happy to see me. I placed the only fifty I had in the tray. The drawer and slit closed and the door opened.

A young woman took my coat and brought me to a table. I took my seat and casually looked around. The room was dimly lit, with most of the light coming from the stage. Smoke hung in the air like a summer haze waiting for a good thunderstorm. A waitress asked, “Drink sir?” I ordered a dirty martini and enjoyed the rest of the show. It’d been a shit day, I needed a break.

Once the show was done and the band started to pack up, I walked up to Big John. “Apparently you’re a man who can help me with USSTAB and some TN3270 animations.” I say. He finished putting away his keytar in its carrying case. “I could be, what’s in it for me?” My wallet was empty so I figured a play on his emotional side might work, “You’d get a chance to piss off Rico and the EF Gang.”

Big John looked at me and smiled. “Anything to piss of that hothead, follow me.” I grabbed my coat from the front and followed him.

Big John was the type of guy who lived up to the name. He was massive. Use to play professional football before he got injured and went back to his original loves: hacking and piano. Long dark hair and an even longer and darker beard made him look menacing. But if you ever knew Big John, you’d know he was just a big ‘ol softy.

John led me to another alleyway behind the Duchess. What was it with this city and alleyways? It looked like the rain had let up, but it had left a cold, damp feeling in the air. Parked in the alley was a van, with a wizard riding a corvette painted on the side. Big John opened the back, set his keytar down and motioned for me to get in the van.

Inside was a nicer office space than I have. Expensive, custom mechanical keyboards lined one wall. Large 4k monitors hung on moveable arms. An Aeron chair was bolted to the floor. Somewhere, invisible to me, was a computer powerful enough to drive this setup.

“So, I take it you’ve been to both Tommy and Dave already?” he asked over the clicking of his mechanical keyboard as he logged on.

“Yes,” I reply. “I think I understand enough to get started making my own logon screens. I can control the flow and color of a TN3270 session, and I know how to use HLASM to do so. But Dave kept referring to things like MSGs and a USS Table which makes no sense to me.”

Big John chuckled and sat down, lighting what looked like a hand-rolled cigarette but smelled like a skunk. “Don’t worry about Dave,” he said, taking a few puffs, “he’s an ex-EF Boy, he’s still trying to get use to sharing information that people can understand. Sometimes he’s still a little cryptic. Let’s get started.”

“When you connect to a mainframe, nine times outta ten its going to be VTAM,” Big John explains.

“VTAM is like the first screen of an infocom game. It lets you know where you are, but from there it’s up to you where you go, you get me?” he asks between puffs.

I did, and I didn’t. All I wanted to do was make pretty mainframes.

“First thing you gotta know about VTAM is that it uses what it calls Unformatted System Services tables. Or USS tables for short. This file is normally specified in your TN3270 configuration file.” Big John swiveled his chair and launched his TN3270 client, connected, and opened a file labeled USER.TCPPARMS(TN3270) He pointed to a specific line:

image

“This line right here tells TCP to tell VTAM to use the file ’USSECORP’ when a client connects.” he said, closing the file. He then opened ’USER.PROCLIB(TN3270)’ and pointed at a different line:

image
image
image

“And that right there is where we’re gonna find USSECORP,” again he closed the current file and opened another folder: USER.-VTAMLIB. And sure enough, glowing a deep blue, in the back of this van was USSECORP!

image

“So now you know where to send your compiled HLASM, your L.SYSLMOD. Just overwrite that file and you’ll be good to go. Oh wait!” John laughed, “I haven’t explained how you can use the USS Table to make it less boring. Right, well it’s easy—ish.

“The USS Table is basically a set of macros you call to tell VTAM what to do on each message or command it receives. Let’s take a look at this example.” He pointed to the other screen.

image

“We start the USS Table with the Macro USSTAB passing it the argument FORMAT. Just always set it to DYNAMIC. This is saying, from here on out we’re in USSTAB. The next line, it’s important.”

image

“This calls the USSMSG macro, which you can read in SYS1-.SISTMAC1(USSMSG). You can pass it a bunch of variables, but for you, just pass it the MSG= and BUFFER= variables. MSG=10 in our case is the default ‘hey you just connected’ message. BUFFER takes two arguments. SCAN will look through and replace any instance of keywords with the actual variable. Some examples would be @@@@DATE and @@@@TIME. Which would replace those items with the actual date/time. BUF010 is a pointer. It points to a data structure. The first thing BUFFER expects is the length of the buffer. Since we might add/remove more to our screen we can use just get the total size by subtracting the location of END010 by BEGIN010. Everything else inside there is what will be sent to VTAM to send to your TN3270 emulator. You keepin’ up my man?”

“Yeah,” I replied. “I think I got it. That line X’2902C0F842F1’ is a TN3270 command setting the text blue ( x42 xF1 ) and that other line, two down, with 10C , just means to repeat that space ten times before we insert the cursor.”

John smirked, “well look at you, the artist. When you’re done setting USS Tab stuff you just end it with USSEND. Keep in mind, there’re fourteen MSGs, not that you’ll need to deal with them if you don’t want to.”

Big John got up and settled into the driver’s seat, “Where ya headin?” he asked. I guess he was done teaching me what I needed to learn. “Fifth and Gibson,” I replied. Back to my office. I was eager to get started on my own screen now that I knew what I was doing. I buckled in next to Big John and got to the office, thankfully no sight of Rico or his EF Boys.

————

Back at my desk I created two things. First, I made a quick and dirty python script so I could rapidly prototype TN3270 command ideas I had (included). Second I decided to code up a macro to handle all the MSG types:

First we needed that sweet, sweet JCL header:

image

Next, I needed a way to handle all the messages. I whipped up a quick macro, with all the colors I might need.

image

I needed to address each of the messages, so I did that here. STDTRANS I copied from Big Blue themselves.

image

After that I call the macro for every msg type and end the HLASM.

image

Finally, I added the JCL footer.

image

Happy with the code I’d just written I made myself a screen I’d be happy to see each and every day:

image

I shut down my computer, ordered an Uber, and headed out of the office.

A car pulled up as I looked up from my phone. This wasn’t my Uber, this was a Tesla, a black Tesla. The back door opened. Rico sat in the back, his one eye covered with a patch, gave him the look of a pirate, as did the gun he had pointed at my face. “Get in,” he said, motioning with the large revolver. Having no other option, I shrugged and got in the back of this Tesla, wondering how much this no-show was gonna cost me on Uber. The Tesla sped off, and slammed me in to the back of my seat.

After a few moments of silence, “Just who the fuck do you think you are?” Rico asked.

“Hey, Rico, all I wanted to do was make a nice logon screen for my mainframe.” I quipped. This visibly upset Rico. The driver quietly snickered in the front seat, then said “This guy thinks he’s a sysprog now?”

“Shut up Oren!” Rico turned to me, “It works like this: we control the information. We decide who knows what. You’re wastin’ everyone’s time over some aesthetic changes. The very fact that you phrase it as ‘logon screen’ means you’re not ready to know this information!”

I stammered a response, “Look, I don’t get what the big deal is, if you don’t want to help who cares?” and I showed him a screenshot of my mainframe.

This was not a good idea. Rico’s face turned bright red. “BULLSHIT! You’ve wasted plenty of people’s time! Tommy, Dave, John. You should’ve gone back and read the manuals, like I had to. All 14,000 pages. Instead, you want a short cut. A hand out. Well, sonny, nothing comes easy. There is no possible way your system didn’t come with customization rules, documentation and changes. That just not how it’s done!”

I realized at this point Rico had never heard about the fact that you can emulate your own mainframe at home.5 Oren, turned his head to look at me, “Yeah, there ain’t no way you get to run your own system and do what you want all willy-nilly.”

I noticed the red light before Oren and Rico, and got ready to put a dumb plan in to action. Oren slammed on the brakes and sent Rico flying in to the seat in front of him. Why don’t bad guys ever wear their seatbelts? While Rico was slightly stunned, I lunged and wrestled the gun free from his hands. At the same time, I grabbed my own pea shooter and pointed one each at Oren and Rico.

“Enough of this shit,” I yelled, “you’re too late anyway, I’ve already built and replaced my USS Table.” I made sure to use the correct terminology now. “I already shot and missed you once today Rico, I won’t miss a second time. Now let me out of this car!”

“Ok, ok. Cool it.” said Oren as he slowed the car. Rico just sat and stewed.

I stepped out of the car. “This isn’t the last you’ve heard from us!” Rico yelled, and the black Tesla sped off in to the night.

He was right, of course. It wouldn’t be the last time I clashed with the EF gang and lived to tell about it.

————

I couldn’t believe that was six years ago. Bigendian knew to reach out to me because I had done some nice screens for him in the past. My skills at making EBCDIC art since then had improved vastly.

image
image

Thanks to another meeting years later with Big John, I learned you can add lines and graphics to make shapes using the rarely documented SFE GE SHAPE ( x08 ) command. At this point, I had the three-headed beast as a rough idea in my head what I wanted the screen to look like. But, I needed a way to animate the Windows 10 update nag screen.

Like a small dog running in to a screen door, it hit me. I could use the MSGs and an AIF to display the nag screen!

You see, when you first connect, that’s a MSG10 screen. If you hit enter, to the user it appears as though the screen just refreshed. But what’s really happening is VTAM loads a MSG02 screen. Because you entered an invalid command (nothing). I could use an AIF statement to only show the Windows 10 nag screen if an invalid command was entered.

Above, where I declared the colors, I could also declare some shapes:

image

And, with the help of Tommy’s table, the one that gave me the coordinates for screen positions, and Big John’s graphics, I could overlay the nag box on the screen. But only if the MSG is type 02. See code on page 618.

With that final piece of the puzzle I gave Bigendian Smalls a short demo.

image

Then I hit <enter> , and it all came together.

image

“Wow, that’s really awesome.” he replied over ICQ. It sure was.

Appendix: Code Listing
image
image
image

17:09 Infect to Protect

by Leandro “acidx” Pereira

Writing viruses is a sure way to learn not only the intricacies of linkers and loaders, but also techniques to covertly add additional code to an existing executable. Using such clever techniques to wreck havoc is not very neighborly, so here’s a way to have some fun, by injecting additional code to tighten the security of an ELF executable.

Since there’s no need for us to hide the payload, the injection technique used here is pretty rudimentary. We find some empty space in a text segment, divert the entry point to that space, run a bit of code, then execute the program as usual. Our payload will not delete files, scan the network for vulnerabilities, self-replicate, or anything nefarious; rather, it will use seccomp-bpf to limit the system calls a process can invoke.

Caveats

By design, seccomp-bpf is unable to read memory; this means that string arguments, such as in the open() syscall, cannot be verified. It would otherwise be a race condition, as memory could be modified after the filter had approved the system call dispatch, thwarting the mechanism.

It’s not always easy to determine which system calls a program will invoke. One could run it under strace(1) , but that would require a rather high test coverage to be accurate. It’s also likely that the standard library might change the set of system calls, even as the program’s local code is unchanged. Grouping system calls by functionality sets might be a practical way to build the white list.

image

Which system calls a process invokes might change depending on program state. For instance, during initialization, it is acceptable for a program to open and read files; it might not be so after the initialization is complete.

Also, seccomp-bpf filters are limited in size. This makes it more difficult to provide fine-grained filters, although eBPF maps0 could be used to shrink this PoC so slightly better filters could be created.

Scripting like a kid

Filters for seccomp-bpf are installed using the prctl(2) system call. In order for the filter to be effective, two calls are necessary. The first call will forbid changes to the filter during execution, while the second will actually install it.

The first call is simple enough, as it only has numeric arguments. The second call, which contains the BPF program itself, is slightly trickier. It’s not possible to know, beforehand, where the BPF program will land in memory. This is not such a big issue, though; the common trick is to read the stack, knowing that the call instruction on x86 will store the return address on the stack. If the BPF program is right after the call instruction, it’s easy to obtain its address from the stack.

image

The BPF virtual machine has its own instruction set. Since the shellcode is written in assembly, it’s easier to just define some macros for each BPF bytecode instruction and use them.

image
image

By listing all the available system calls from syscall.h ,1 it’s trivial to write a BPF filter that will deny the execution of all system calls, except for a chosen few.

image
Infecting

One of the nice things about open source being ubiquitous today is that it’s possible to find source code for the most unusual things. This is the case of ELFKickers, a package that contains a bunch of little utilities to manipulate ELF files.2

I’ve modified the infect.c program from that collection ever so slightly, so that the placeholder jmp instruction is patched in the payload and the entry point is correctly calculated for this kind of payload.

A Makefile takes care of assembling the payload, formatting it in a way that it can be included in the C source, building a simple guinea pig program twice, then infecting one of the executables. Complete source code is available.3

image
Testing & Conclusion

See the excerpt of a system call trace, from the moment that the seccomp-bpf filter is installed, to the moment the process is killed by the kernel with a SIGSYS signal.

Happy hacking!

image

Excerpt of strace(1) output when running hello.c .

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.220.187.178