It is challenging to prototype network applications such as NAT, which need compute-intensive packet header processing, on programmable network processors while sustaining line speed. In this paper, we design, implement, and evaluate a NAT subsystem capable of run-time adaptation on an experimental board containing a pair of Intel IXP2400 network processors. The subsystem switches between two modes (NAT or NAPT) based on the fullness of the available global addresses or on user configuration. We evaluate and validate our system through simulations and hardware experiments. We find that the system bottleneck is DRAM access latency. We also demonstrate that our NAT subsystem can support more than five hundred thousand concurrent TCP/UDP sessions and sustain the full line rate on two Gigabit Ethernet links. Our experimental results and architecture can inform other designs and implementations of network services on programmable network processors, since these processors have similar architectures, functionalities, and components.
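The switch-over behavior summarized above can be illustrated with a minimal sketch. It is not the IXP2400 implementation: the class name `SwitchOverTranslator`, the fullness threshold, the shared NAPT address, and the starting port are all hypothetical choices made for illustration. It only shows one plausible way to choose between one-to-one NAT and port-based NAPT from the fullness of the global address pool or from a user-configured mode.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class SwitchOverTranslator:
    """Toy NAT/NAPT translator; all names and defaults are hypothetical."""
    global_pool: List[str]             # free global addresses for one-to-one NAT
    napt_address: str                  # shared global address used in NAPT mode (assumed)
    threshold: float = 0.9             # pool fullness that triggers NAPT (assumed)
    forced_mode: Optional[str] = None  # user configuration: "NAT", "NAPT", or None
    next_port: int = 1024              # next free NAPT port (assumed starting point)
    nat_table: Dict[str, str] = field(default_factory=dict)
    napt_table: Dict[Tuple[str, int], Tuple[str, int]] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Remember the initial pool size so fullness can be computed later.
        self.pool_size = len(self.global_pool)

    def fullness(self) -> float:
        """Fraction of the global address pool already handed out."""
        used = self.pool_size - len(self.global_pool)
        return used / self.pool_size if self.pool_size else 1.0

    def mode(self) -> str:
        """User configuration wins; otherwise switch on pool fullness."""
        if self.forced_mode is not None:
            return self.forced_mode
        return "NAPT" if self.fullness() >= self.threshold else "NAT"

    def translate(self, src_ip: str, src_port: int) -> Tuple[str, int]:
        """Map a private (address, port) pair to a global (address, port) pair."""
        # Existing bindings are reused regardless of the current mode.
        if src_ip in self.nat_table:
            return self.nat_table[src_ip], src_port
        key = (src_ip, src_port)
        if key in self.napt_table:
            return self.napt_table[key]
        # NAT mode: bind the host to its own global address, port unchanged.
        if self.mode() == "NAT" and self.global_pool:
            self.nat_table[src_ip] = self.global_pool.pop(0)
            return self.nat_table[src_ip], src_port
        # NAPT mode (or an exhausted pool): share one address, allocate a port.
        binding = (self.napt_address, self.next_port)
        self.next_port += 1
        self.napt_table[key] = binding
        return binding
```

In this sketch, a translator built with a two-address pool and `threshold=0.5` serves its first host in NAT mode and then switches to NAPT for later hosts, while the first host keeps its one-to-one binding. Setting `forced_mode` models the user-configuration path.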