If you are programming an ATmega328-PU for the first time, straight out of the box, DO slow down the bit clock of the programmer, either with the -B option of avrdude or with the switch on the programmer hardware.
Finding this out the hard way was absolutely the most frustrating experience I've had programming a microcontroller.
The ATmega family from Atmel is a wonder of simplicity and power, BUT you need to know your devil.
For every number (ATmega168, ATmega88, ATmega328 and so on...) there are different versions of the processor, marked by suffixes such as V, P, PU and others, and they are not the same!!! Mainly the signature changes, and that makes them effectively different parts. Look in the datasheet to see which one is right for the processor you have: if you don't choose the right one in the programmer, you can't program the chip.
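To make the signature point concrete, here is a minimal Python sketch of the kind of check a programmer tool does before it agrees to write anything. The signature bytes in the table are the ones I believe the datasheets list for these parts, so do check them against your own datasheet. The helper names (identify, check_part) are made up for this example and are not part of any real tool.

```python
# Device signatures as (byte0, byte1, byte2).
# Assumption: values taken from the datasheets as I remember them;
# verify against the datasheet for your exact chip.
SIGNATURES = {
    "ATmega168":  (0x1E, 0x94, 0x06),
    "ATmega168P": (0x1E, 0x94, 0x0B),
    "ATmega328":  (0x1E, 0x95, 0x14),
    "ATmega328P": (0x1E, 0x95, 0x0F),
}

def identify(signature):
    """Return the part name matching a 3-byte signature, or None if unknown."""
    for name, sig in SIGNATURES.items():
        if sig == tuple(signature):
            return name
    return None

def check_part(selected, signature):
    """True only if the signature read from the chip matches the selected part."""
    return SIGNATURES.get(selected) == tuple(signature)

# A chip that answers 1E 95 14 while the tool is set up for a '328P:
read_back = (0x1E, 0x95, 0x14)
print(identify(read_back))                  # ATmega328
print(check_part("ATmega328P", read_back))  # False
```

The last byte is the only difference between the '328 and the '328P here, and that is enough for the programmer to refuse the chip.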
Back to the ATmega328-PU: this small bugger is a '168 with double the memory, so, for example, you can fit a program twice as long from Arduino. BUT it is different from the '328 used in Arduino: they use a '328P, the low-power version, while what you have is a 328-PU, which is more generally available.
To use it with Arduino, search the Arduino forum for ATmega328 or ATmega328PU; there you will find references to what you need to change in the Arduino IDE to program the bootloader.
Nobody mentions, except vaguely, that you have to slow down the bit clock. Add the fact that I never needed to do so with the ATmega168 or the ATtiny45 (I've programmed them hundreds of times), and it took me a whole afternoon to find out what was wrong. In the end I flipped the switch on the programmer that slows the bit clock, and everything went fine...
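If your programmer has no switch, the -B option of avrdude does the same job: it takes the bit-clock period in microseconds. Here is a rough Python sketch for picking a value. It rests on two assumptions of mine, not on anything I measured: that the chip needs each half of the SCK pulse to last more than 2 CPU clock cycles (so a full SCK period of more than 4 cycles), and that a fresh chip runs at about 1 MHz. The part ID "m328" and programmer ID "usbasp" are placeholders too; use whatever your avrdude version and hardware call them.

```python
import math

def min_bitclock_us(f_cpu_hz, margin=2):
    """Suggest a value for avrdude's -B option (SCK period in microseconds).

    Assumption: SCK low and high must each last more than 2 CPU cycles,
    so one SCK period must be longer than 4 CPU cycles. The margin factor
    adds some headroom on top of that limit.
    """
    return math.ceil(4_000_000 * margin / f_cpu_hz)

def avrdude_command(part, programmer, f_cpu_hz):
    """Build an avrdude command line with a slowed-down bit clock."""
    return ["avrdude", "-c", programmer, "-p", part,
            "-B", str(min_bitclock_us(f_cpu_hz))]

# Assumed: a chip out of the box running at roughly 1 MHz.
cmd = avrdude_command("m328", "usbasp", 1_000_000)
print(" ".join(cmd))  # avrdude -c usbasp -p m328 -B 8
```

A bigger -B only makes programming slower, so when in doubt go higher and bring it back down once the chip answers.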
Gosh, Batman! Slow that CLOCK!!!!