A processor includes a widest set of data registers that corresponds to a given logical processor. Each of the data registers of the widest set have a first width in bits. A decode unit that corresponds to the given logical processor is to decode instructions that specify the data registers of the widest set, and is to decode an atomic store to memory instruction. The atomic store to memory instruction is to indicate data that is to have a second width in bits that is wider than the first width in bits. The atomic store to memory instruction is to indicate memory address information associated with a memory location. An execution unit is coupled with the decode unit. The execution unit, in response to the atomic store to memory instruction, is to atomically store the indicated data to the memory location.