File gcc-power7-sles-11sp1.patch02a of Package gcc43
2009-10-15 Michael Meissner <meissner@linux.vnet.ibm.com> Pat Haugen <pthaugen@us.ibm.com> Revital Eres <ERES@il.ibm.com> Peter Bergner <bergner@vnet.ibm.com> Backport rs6000 changes from GCC 4.5 mainline to 4.3 that includes the power7/VSX support: 2009-10-15 Michael Meissner <meissner@linux.vnet.ibm.com> PR target/23983 * config/rs6000/predicates.md: Update copyright year. * config/rs6000/altivec.md: Ditto. * config/rs6000/t-rs6000 (TM_H): Add rs6000-builtin.def. (MD_INCLUDES): Add a2.md. * config/rs6000/rs6000.c (rs6000_builtin_decls): Change RS6000_BUILTIN_COUNT to MAX_RS6000_BUILTINS. (builtin_classify): New static vector to classify various builtins to get the tree attributes correct. (def_builtin): Set the attributes of builtins based on what the builtin does (i.e. memory operation, floating point, saturation need special attributes, others are pure functions). * config/rs6000/rs6000.h (enum rs6000_btc): New enum to classify the builtins. (enum rs6000_builtins): Include rs6000-builtin.def to define the builtins. Change the end marker to MAX_RS6000_BUILTINS from RS6000_BUILTIN_COUNT. (rs6000_builtin_decls): Change RS6000_BUILTIN_COUNT to MAX_RS6000_BUILTINS. * config/rs6000/rs6000-builtin.def: New file that combines the builtin enumeration name and attributes. 2009-09-24 Michael Meissner <meissner@linux.vnet.ibm.com> * config/rs6000/predicates.md (indexed_or_indirect_operand): Delete VSX load/store with update support. * config/rs6000/rs6000.c (rs6000_legitimate_address_p): Ditto. * config/rs6000/vsx.md (vsx_mov<mode>): Ditto. (vsx_movti): Ditto. (VSX_U): Delete. (VSbit): Ditto. (VStype_load_update): Ditto. (VStype_store_update): Ditto. (vsx_load<VSX_U:mode>_update_<P:mptrsize>): Ditto. (vsx_store<VSX_U:mode>_update_<P:mptrsize>): Ditto. * config/rs6000/rs6000.h (enum rs6000_builtins): Delete VSX load/store with update builtins. 2009-09-17 Revital Eres <eres@il.ibm.com> * config/rs6000/rs6000.c (rs6000_builtin_support_vector_misalignment): New function. 
(TARGET_SUPPORT_VECTOR_MISALIGNMENT): Define. 2009-09-14 Michael Meissner <meissner@linux.vnet.ibm.com> PR target/41210 * config/rs6000/rs6000.c (rs6000_function_value): V2DF and V2DI are returned in the same register (vs34 or v2) that Altivec vector types are returned in. (rs6000_libcall_value): Ditto. PR target/41331 * config/rs6000/rs6000.c (rs6000_emit_move): Use gen_add3_insn instead of explicit addsi3/adddi3 calls. (rs6000_split_multireg_move): Ditto. (rs6000_emit_allocate_stack): Ditto. (rs6000_emit_prologue): Ditto. (rs6000_output_mi_thunk): Ditto. * config/rs6000/rs6000.md (bswapdi*): Don't assume the pointer size is 64 bits if we can use 64-bit registers. 2009-08-23 Alan Modra <amodra@bigpond.net.au> PR target/41081 * config/rs6000/rs6000.md (rotlsi3_64, ashlsi3_64, lshrsi3_64, ashrsi3_64): New. 2009-08-21 Michael Meissner <meissner@linux.vnet.ibm.com> PR target/40671 * config/rs6000/rs6000.c (rs6000_override_options): Use TARGET_64BIT instead of TARGET_POWERPC64 to set the size of pointers. PR target/41145 * config/rs6000/rs6000.c (rs6000_handle_altivec_attribute): Fix reporting of vector + decimal/boolean/complex error. 2009-08-21 Jakub Jelinek <jakub@redhat.com> * config/rs6000/rs6000.c (rs6000_init_builtins): Fix type of __vector double TYPE_DECL. 2009-07-30 Michael Meissner <meissner@linux.vnet.ibm.com> Pat Haugen <pthaugen@us.ibm.com> Revital Eres <ERES@il.ibm.com> * config/rs6000/vector.md (VEC_F): Add VSX support. (VEC_A): Ditto. (VEC_N): Ditto. (mov<mode>): Ditto. (vector_load_<mode>): Ditto. (vector_store_<mode>): Ditto. (vector GPR move split): Ditto. (vec_reload_and_plus_<mptrsize>): Ditto. (vec_reload_and_reg_<mptrsize>): Ditto. (add<mode>3): Ditto. (sub<mode>3): Ditto. (mul<mode>3): Ditto. (neg<mode>2): Ditto. (abs<mode>2): Ditto. (smin<mode>3): Ditto. (smax<mode>3): Ditto. (vector_eq<mode>): Ditto. (vector_gt<mode>): Ditto. (vector_ge<mode>): Ditto. (vector_gtu<mode>): Ditto. (vector_select_<mode>_uns): Ditto. (vector_eq_<mode>_p): Ditto. 
(vector_gt_<mode>_p): Ditto. (vector_ge_<mode>_p): Ditto. (vector_gtu_<mode>_p): Ditto. (cr6_test_for_zero): Ditto. (cr6_test_for_zero_reverse): Ditto. (cr6_test_for_lt): Ditto. (cr6_test_for_lt_reverse): Ditto. (xor<mode>3): Ditto. (ior<mode>3): Ditto. (and<mode>3): Ditto. (one_cmpl<mode>2): Ditto. (nor<mode>2): Ditto. (andc<mode>2): Ditto. (float<VEC_int<mode>2): Ditto. (unsigned_float<VEC_int><mode>2): Ditto. (fix_trunc<mode><VEC_int>2): Ditto. (fixuns_trunc<mode><VEC_int>2): Ditto. (vec_init<mode>): (vec_set<mode>): Ditto. (vec_extract<mode>): Ditto. (vec_interleave_highv4sf): Ditto. (vec_interleave_lowv4sf): Ditto. (vec_realign_load_<mode>): Ditto. (vec_shl_<mode>): Ditto. (vec_shr_<mode>): Ditto. (div<mode>3): New patterns for VSX. (vec_interleave_highv2df): Ditto. (vec_interleave_lowv2df): Ditto. (vec_pack_trunc_v2df): Ditto. (vec_pack_sfix_trunc_v2df): Ditto. (vec_pack_ufix_trunc_v2df): Ditto. (vec_unpacks_hi_v4sf): Ditto. (vec_unpacks_lo_v4sf): Ditto. (vec_unpacks_float_hi_v4si): Ditto. (vec_unpacks_float_lo_v4si): Ditto. (vec_unpacku_float_hi_v4si): Ditto. (vec_unpacku_float_lo_v4si): Ditto. (movmisalign<mode>): Ditto. (vector_ceil<mode>2): New patterns for vectorizing math library. (vector_floor<mode>2): Ditto. (vector_btrunc<mode>2): Ditto. (vector_copysign<mode>3): Ditto. * config/rs6000/predicates.md (easy_vector_constant_msb): New predicate for setting the high bit in each word, used for copysign. * config/rs6000/ppc-asm.h (f19): Whitespace. (f32-f63): Define if VSX. (v0-v31): Define if Altivec. (vs0-vs63): Define if VSX. * config/rs6000/t-rs6000 (MD_INCLUDES): Add power7.md and vsx.md. * config/rs6000/power7.md: New file, provide tuning parameters for -mcpu=power7. * config/rs6000/rs6000-c.c (rs6000_macro_to_expand): Add VSX support. (rs6000_cpu_cpp_builtins): Ditto. (altivec_overloaded_builtins): Ditto. (altivec_resolve_overloaded_builtin): Ditto. 
* config/rs6000/rs6000.opt (-mno-vectorize-builtins): Add new debug switch to disable vectorizing simple math builtin functions. * config/rs6000/rs6000.c (rs6000_builtin_vectorized_function): Vectorize simple math builtin functions. (TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION): Define target hook to vectorize math builtins. (rs6000_override_options): Enable -mvsx on -mcpu=power7. (rs6000_builtin_conversion): Add VSX/power7 support. (rs6000_builtin_vec_perm): Ditto. (vsplits_constant): Add support for loading up a vector constant with just the high bit set in each part. (rs6000_expand_vector_init): Add VSX/power7 support. (rs6000_expand_vector_set): Ditto. (rs6000_expand_vector_extract): Ditto. (rs6000_emit_move): Ditto. (bdesc_3arg): Ditto. (bdesc_2arg): Ditto. (bdesc_1arg): Ditto. (rs6000_expand_ternop_builtin): Ditto. (altivec_expand_builtin): Ditto. (rs6000_expand_unop_builtin): Ditto. (rs6000_init_builtins): Ditto. (altivec_init_builtins): Ditto. (builtin_function_type): Ditto. (rs6000_common_init_builtins): Ditto. (rs6000_handle_altivec_attribute): Ditto. (rs6000_mangle_type): Ditto. (rs6000_vector_mode_supported_p): Ditto. (rs6000_mode_dependent_address): Altivec addresses with AND -16 are mode dependent. * config/rs6000/vsx.md: New file for VSX support. * config/rs6000/rs6000.h (EASY_VECTOR_MSB): New macro for identifying values with just the most significant bit set. (enum rs6000_builtins): Add builtins for VSX. Add simple math vectorized builtins. * config/rs6000/altivec.md (UNSPEC_VRFIP): Delete. (UNSPEC_VRFIM): Delete. (splitter for loading up vector with most significant bit): New splitter for vectorizing copysign. (altivec_vrfiz): Rename from altivec_fturncv4sf2. Add support for vectorizing simple math functions. (altivec_vrfip): Add support for vectorizing simple math functions. (altivec_vrfim): Ditto. (altivec_copysign_v4sf3): New insn for Altivec copysign support. * config/rs6000/rs6000.md (UNSPEC_BPERM): New constant. 
(power7.md, vsx.md): Include for power7 support. (copysigndf3): Use VSX instructions if -mvsx. (negdf2_fpr): Ditto. (absdf2_fpr): Ditto. (nabsdf2_fpr): Ditto. (adddf3_fpr): Ditto. (subdf3_fpr): Ditto. (muldf3_fpr): Ditto. (divdf3_fpr): Ditto. (fix_truncdfdi2_fpr): Ditto. (cmpdf_internal1): Ditto. (fred, fred_fpr): Convert into expander/insn to add VSX support. (btruncdf2, btruncdf2_fpr): Ditto. (ceildf2, ceildf2_fpr): Ditto. (floordf2, floordf2_fpr): Ditto. (floatdidf2, floatdidf2_fpr): Ditto. (fmadddf4_fpr): Name insn. Use VSX instructions if -mvsx. (fmsubdf4_fpr): Ditto. (fnmadddf4_fpr_1): Ditto. (fnmadddf4_fpr_2): Ditto. (fnmsubdf4_fpr_1): Ditto. (fnmsubdf4_fpr_2): Ditto. (fixuns_truncdfdi2): Add expander for VSX support. (fix_truncdfdi2): Ditto. (fix_truncdfsi2): Ditto. (ftruncdf2): Ditto. (btruncsf2): Whitespace. (movdf_hardfloat32): Add support for VSX registers. (movdf_softfloat32): Ditto. (movdf_hardfloat64): Ditto. (movdf_hardfloat64_mfpgpr): Ditto. (movdf_softfloat64): Ditto. (movti splitters): Add check for vector registers supporting TImode in the future. (bpermd): Add power7 bpermd instruction. * config/rs6000/altivec.h (vec_div): Define if VSX. (vec_mul): Ditto. (vec_msub): Ditto. (vec_nmadd): Ditto. (vec_nearbyint): Ditto. (vec_rint): Ditto. (vec_sqrt): Ditto. (all predicates): Use the generic builtin function, and not the V4SF specific function so that the predicates will work with VSX's V2DF. (vec_all_*): Ditto. (vec_any_*): Ditto. * doc/extend.texi (PowerPC Altivec/VSX Built-in Functions): Document new VSX functions and types. * doc/invoke.texi (PowerPC options): Document -mpopcntd, -mvsx switches. * doc/md.texi (PowerPC constraints): Document "wd", "wf", "ws", "wa", and "j" constraints. Modify "v" to talk about Altivec instead of just vector. Backport from 4.3 branch: 2009-09-25 Alan Modra <amodra@bigpond.net.au> * config/rs6000/rs6000.md (load_toc_v4_PIC_3c): Correct POWER form of instruction. 
2009-09-23 Alan Modra <amodra@bigpond.net.au> PR target/40473 * config/rs6000/rs6000.c (rs6000_output_function_prologue): Don't call final to emit non-scheduled prologue, instead insert at entry. Index: gcc-4.3.4-20091019/gcc/config/rs6000/aix53.h =================================================================== --- gcc-4.3.4-20091019.orig/gcc/config/rs6000/aix53.h 2008-02-19 10:55:52.000000000 +0100 +++ gcc-4.3.4-20091019/gcc/config/rs6000/aix53.h 2009-10-19 13:40:37.000000000 +0200 @@ -1,6 +1,6 @@ /* Definitions of target machine for GNU compiler, for IBM RS/6000 POWER running AIX V5.3. - Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007 + Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2009 Free Software Foundation, Inc. Contributed by David Edelsohn (edelsohn@gnu.org). @@ -57,20 +57,24 @@ do { \ #undef ASM_SPEC #define ASM_SPEC "-u %{maix64:-a64 %{!mcpu*:-mppc64}} %(asm_cpu)" -/* Common ASM definitions used by ASM_SPEC amongst the various targets - for handling -mcpu=xxx switches. */ +/* Common ASM definitions used by ASM_SPEC amongst the various targets for + handling -mcpu=xxx switches. There is a parallel list in driver-rs6000.c to + provide the default assembler options if the user uses -mcpu=native, so if + you make changes here, make them there also. 
*/ #undef ASM_CPU_SPEC #define ASM_CPU_SPEC \ "%{!mcpu*: %{!maix64: \ %{mpowerpc64: -mppc64} \ %{maltivec: -m970} \ %{!maltivec: %{!mpower64: %(asm_default)}}}} \ +%{mcpu=native: %(asm_cpu_native)} \ %{mcpu=power3: -m620} \ %{mcpu=power4: -mpwr4} \ %{mcpu=power5: -mpwr5} \ %{mcpu=power5+: -mpwr5x} \ %{mcpu=power6: -mpwr6} \ %{mcpu=power6x: -mpwr6} \ +%{mcpu=power7: -mpwr7} \ %{mcpu=powerpc: -mppc} \ %{mcpu=rs64a: -mppc} \ %{mcpu=603: -m603} \ Index: gcc-4.3.4-20091019/gcc/config/rs6000/aix61.h =================================================================== --- gcc-4.3.4-20091019.orig/gcc/config/rs6000/aix61.h 2008-02-19 10:55:53.000000000 +0100 +++ gcc-4.3.4-20091019/gcc/config/rs6000/aix61.h 2009-10-19 13:40:37.000000000 +0200 @@ -1,6 +1,6 @@ /* Definitions of target machine for GNU compiler, - for IBM RS/6000 POWER running AIX V5.3. - Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008 + for IBM RS/6000 POWER running AIX V6.1. + Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009 Free Software Foundation, Inc. Contributed by David Edelsohn (edelsohn@gnu.org). @@ -57,20 +57,24 @@ do { \ #undef ASM_SPEC #define ASM_SPEC "-u %{maix64:-a64 %{!mcpu*:-mppc64}} %(asm_cpu)" -/* Common ASM definitions used by ASM_SPEC amongst the various targets - for handling -mcpu=xxx switches. */ +/* Common ASM definitions used by ASM_SPEC amongst the various targets for + handling -mcpu=xxx switches. There is a parallel list in driver-rs6000.c to + provide the default assembler options if the user uses -mcpu=native, so if + you make changes here, make them there also. 
*/ #undef ASM_CPU_SPEC #define ASM_CPU_SPEC \ "%{!mcpu*: %{!maix64: \ %{mpowerpc64: -mppc64} \ %{maltivec: -m970} \ %{!maltivec: %{!mpower64: %(asm_default)}}}} \ +%{mcpu=native: %(asm_cpu_native)} \ %{mcpu=power3: -m620} \ %{mcpu=power4: -mpwr4} \ %{mcpu=power5: -mpwr5} \ %{mcpu=power5+: -mpwr5x} \ %{mcpu=power6: -mpwr6} \ %{mcpu=power6x: -mpwr6} \ +%{mcpu=power7: -mpwr7} \ %{mcpu=powerpc: -mppc} \ %{mcpu=rs64a: -mppc} \ %{mcpu=603: -m603} \ Index: gcc-4.3.4-20091019/gcc/config/rs6000/aix.h =================================================================== --- gcc-4.3.4-20091019.orig/gcc/config/rs6000/aix.h 2008-02-19 10:55:53.000000000 +0100 +++ gcc-4.3.4-20091019/gcc/config/rs6000/aix.h 2009-10-19 13:40:37.000000000 +0200 @@ -1,6 +1,6 @@ /* Definitions of target machine for GNU compiler, for IBM RS/6000 POWER running AIX. - Copyright (C) 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007 + Copyright (C) 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008 Free Software Foundation, Inc. This file is part of GCC. @@ -155,14 +155,16 @@ #define LIB_SPEC "%{pg:-L/lib/profiled -L/usr/lib/profiled}\ %{p:-L/lib/profiled -L/usr/lib/profiled} %{!shared:%{g*:-lg}} -lc" +/* Static linking with shared libstdc++ requires libsupc++ as well. */ +#define LIBSTDCXX_STATIC "-lstdc++ -lsupc++" + /* This now supports a natural alignment mode. */ /* AIX word-aligns FP doubles but doubleword-aligns 64-bit ints. */ #define ADJUST_FIELD_ALIGN(FIELD, COMPUTED) \ - (TARGET_ALIGN_NATURAL ? (COMPUTED) : \ - (TYPE_MODE (TREE_CODE (TREE_TYPE (FIELD)) == ARRAY_TYPE \ - ? get_inner_array_type (FIELD) \ - : TREE_TYPE (FIELD)) == DFmode \ - ? MIN ((COMPUTED), 32) : (COMPUTED))) + ((TARGET_ALIGN_NATURAL == 0 \ + && TYPE_MODE (strip_array_types (TREE_TYPE (FIELD))) == DFmode) \ + ? MIN ((COMPUTED), 32) \ + : (COMPUTED)) /* AIX increases natural record alignment to doubleword if the first field is an FP double while the FP fields remain word aligned. 
*/ @@ -202,6 +204,8 @@ /* Define cutoff for using external functions to save floating point. */ #define FP_SAVE_INLINE(FIRST_REG) ((FIRST_REG) == 62 || (FIRST_REG) == 63) +/* And similarly for general purpose registers. */ +#define GP_SAVE_INLINE(FIRST_REG) ((FIRST_REG) < 32) /* __throw will restore its own return address to be the same as the return address of the function that the throw is being made to. Index: gcc-4.3.4-20091019/gcc/config/rs6000/altivec.h =================================================================== --- gcc-4.3.4-20091019.orig/gcc/config/rs6000/altivec.h 2009-10-19 13:39:52.000000000 +0200 +++ gcc-4.3.4-20091019/gcc/config/rs6000/altivec.h 2009-10-19 13:40:37.000000000 +0200 @@ -1,5 +1,5 @@ /* PowerPC AltiVec include file. - Copyright (C) 2002, 2003, 2004, 2005 Free Software Foundation, Inc. + Copyright (C) 2002, 2003, 2004, 2005, 2008, 2009 Free Software Foundation, Inc. Contributed by Aldy Hernandez (aldyh@redhat.com). Rewritten by Paolo Bonzini (bonzini@gnu.org). @@ -7,7 +7,7 @@ GCC is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published - by the Free Software Foundation; either version 2, or (at your + by the Free Software Foundation; either version 3, or (at your option) any later version. GCC is distributed in the hope that it will be useful, but WITHOUT @@ -15,17 +15,14 @@ or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. - You should have received a copy of the GNU General Public License - along with GCC; see the file COPYING. If not, write to the - Free Software Foundation, 51 Franklin Street, Fifth Floor, Boston, - MA 02110-1301, USA. */ - -/* As a special exception, if you include this header file into source - files compiled by GCC, this header file does not by itself cause - the resulting executable to be covered by the GNU General Public - License. 
This exception does not however invalidate any other - reasons why the executable file might be covered by the GNU General - Public License. */ + Under Section 7 of GPL version 3, you are granted additional + permissions described in the GCC Runtime Library Exception, version + 3.1, as published by the Free Software Foundation. + + You should have received a copy of the GNU General Public License and + a copy of the GCC Runtime Library Exception along with this program; + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see + <http://www.gnu.org/licenses/>. */ /* Implemented to conform to the specification included in the AltiVec Technology Programming Interface Manual (ALTIVECPIM/D 6/1999 Rev 0). */ @@ -309,6 +306,17 @@ #define vec_splats __builtin_vec_splats #define vec_promote __builtin_vec_promote +#ifdef __VSX__ +/* VSX additions */ +#define vec_div __builtin_vec_div +#define vec_mul __builtin_vec_mul +#define vec_msub __builtin_vec_msub +#define vec_nmadd __builtin_vec_nmadd +#define vec_nearbyint __builtin_vec_nearbyint +#define vec_rint __builtin_vec_rint +#define vec_sqrt __builtin_vec_sqrt +#endif + /* Predicates. For C++, we use templates in order to allow non-parenthesized arguments. 
For C, instead, we use macros since non-parenthesized arguments were @@ -359,14 +367,14 @@ __altivec_scalar_pred(vec_any_out, __builtin_altivec_vcmpbfp_p (__CR6_EQ_REV, a1, a2)) __altivec_unary_pred(vec_all_nan, - __builtin_altivec_vcmpeqfp_p (__CR6_EQ, a1, a1)) + __builtin_altivec_vcmpeq_p (__CR6_EQ, a1, a1)) __altivec_unary_pred(vec_any_nan, - __builtin_altivec_vcmpeqfp_p (__CR6_LT_REV, a1, a1)) + __builtin_altivec_vcmpeq_p (__CR6_LT_REV, a1, a1)) __altivec_unary_pred(vec_all_numeric, - __builtin_altivec_vcmpeqfp_p (__CR6_LT, a1, a1)) + __builtin_altivec_vcmpeq_p (__CR6_LT, a1, a1)) __altivec_unary_pred(vec_any_numeric, - __builtin_altivec_vcmpeqfp_p (__CR6_EQ_REV, a1, a1)) + __builtin_altivec_vcmpeq_p (__CR6_EQ_REV, a1, a1)) __altivec_scalar_pred(vec_all_eq, __builtin_vec_vcmpeq_p (__CR6_LT, a1, a2)) @@ -387,13 +395,13 @@ __altivec_scalar_pred(vec_any_lt, __builtin_vec_vcmpgt_p (__CR6_EQ_REV, a2, a1)) __altivec_scalar_pred(vec_all_ngt, - __builtin_altivec_vcmpgtfp_p (__CR6_EQ, a1, a2)) + __builtin_altivec_vcmpgt_p (__CR6_EQ, a1, a2)) __altivec_scalar_pred(vec_all_nlt, - __builtin_altivec_vcmpgtfp_p (__CR6_EQ, a2, a1)) + __builtin_altivec_vcmpgt_p (__CR6_EQ, a2, a1)) __altivec_scalar_pred(vec_any_ngt, - __builtin_altivec_vcmpgtfp_p (__CR6_LT_REV, a1, a2)) + __builtin_altivec_vcmpgt_p (__CR6_LT_REV, a1, a2)) __altivec_scalar_pred(vec_any_nlt, - __builtin_altivec_vcmpgtfp_p (__CR6_LT_REV, a2, a1)) + __builtin_altivec_vcmpgt_p (__CR6_LT_REV, a2, a1)) /* __builtin_vec_vcmpge_p is vcmpgefp for floating-point vector types, while for integer types it is converted to __builtin_vec_vcmpgt_p, @@ -408,13 +416,13 @@ __altivec_scalar_pred(vec_any_ge, __builtin_vec_vcmpge_p (__CR6_EQ_REV, a1, a2)) __altivec_scalar_pred(vec_all_nge, - __builtin_altivec_vcmpgefp_p (__CR6_EQ, a1, a2)) + __builtin_altivec_vcmpge_p (__CR6_EQ, a1, a2)) __altivec_scalar_pred(vec_all_nle, - __builtin_altivec_vcmpgefp_p (__CR6_EQ, a2, a1)) + __builtin_altivec_vcmpge_p (__CR6_EQ, a2, a1)) 
__altivec_scalar_pred(vec_any_nge, - __builtin_altivec_vcmpgefp_p (__CR6_LT_REV, a1, a2)) + __builtin_altivec_vcmpge_p (__CR6_LT_REV, a1, a2)) __altivec_scalar_pred(vec_any_nle, - __builtin_altivec_vcmpgefp_p (__CR6_LT_REV, a2, a1)) + __builtin_altivec_vcmpge_p (__CR6_LT_REV, a2, a1)) #undef __altivec_scalar_pred #undef __altivec_unary_pred @@ -426,11 +434,11 @@ __altivec_scalar_pred(vec_any_nle, #define vec_all_in(a1, a2) __builtin_altivec_vcmpbfp_p (__CR6_EQ, (a1), (a2)) #define vec_any_out(a1, a2) __builtin_altivec_vcmpbfp_p (__CR6_EQ_REV, (a1), (a2)) -#define vec_all_nan(a1) __builtin_altivec_vcmpeqfp_p (__CR6_EQ, (a1), (a1)) -#define vec_any_nan(a1) __builtin_altivec_vcmpeqfp_p (__CR6_LT_REV, (a1), (a1)) +#define vec_all_nan(a1) __builtin_vec_vcmpeq_p (__CR6_EQ, (a1), (a1)) +#define vec_any_nan(a1) __builtin_vec_vcmpeq_p (__CR6_LT_REV, (a1), (a1)) -#define vec_all_numeric(a1) __builtin_altivec_vcmpeqfp_p (__CR6_LT, (a1), (a1)) -#define vec_any_numeric(a1) __builtin_altivec_vcmpeqfp_p (__CR6_EQ_REV, (a1), (a1)) +#define vec_all_numeric(a1) __builtin_vec_vcmpeq_p (__CR6_LT, (a1), (a1)) +#define vec_any_numeric(a1) __builtin_vec_vcmpeq_p (__CR6_EQ_REV, (a1), (a1)) #define vec_all_eq(a1, a2) __builtin_vec_vcmpeq_p (__CR6_LT, (a1), (a2)) #define vec_all_ne(a1, a2) __builtin_vec_vcmpeq_p (__CR6_EQ, (a1), (a2)) @@ -442,10 +450,10 @@ __altivec_scalar_pred(vec_any_nle, #define vec_any_gt(a1, a2) __builtin_vec_vcmpgt_p (__CR6_EQ_REV, (a1), (a2)) #define vec_any_lt(a1, a2) __builtin_vec_vcmpgt_p (__CR6_EQ_REV, (a2), (a1)) -#define vec_all_ngt(a1, a2) __builtin_altivec_vcmpgtfp_p (__CR6_EQ, (a1), (a2)) -#define vec_all_nlt(a1, a2) __builtin_altivec_vcmpgtfp_p (__CR6_EQ, (a2), (a1)) -#define vec_any_ngt(a1, a2) __builtin_altivec_vcmpgtfp_p (__CR6_LT_REV, (a1), (a2)) -#define vec_any_nlt(a1, a2) __builtin_altivec_vcmpgtfp_p (__CR6_LT_REV, (a2), (a1)) +#define vec_all_ngt(a1, a2) __builtin_vec_vcmpgt_p (__CR6_EQ, (a1), (a2)) +#define vec_all_nlt(a1, a2) 
__builtin_vec_vcmpgt_p (__CR6_EQ, (a2), (a1)) +#define vec_any_ngt(a1, a2) __builtin_vec_vcmpgt_p (__CR6_LT_REV, (a1), (a2)) +#define vec_any_nlt(a1, a2) __builtin_vec_vcmpgt_p (__CR6_LT_REV, (a2), (a1)) /* __builtin_vec_vcmpge_p is vcmpgefp for floating-point vector types, while for integer types it is converted to __builtin_vec_vcmpgt_p, @@ -455,10 +463,10 @@ __altivec_scalar_pred(vec_any_nle, #define vec_any_le(a1, a2) __builtin_vec_vcmpge_p (__CR6_EQ_REV, (a2), (a1)) #define vec_any_ge(a1, a2) __builtin_vec_vcmpge_p (__CR6_EQ_REV, (a1), (a2)) -#define vec_all_nge(a1, a2) __builtin_altivec_vcmpgefp_p (__CR6_EQ, (a1), (a2)) -#define vec_all_nle(a1, a2) __builtin_altivec_vcmpgefp_p (__CR6_EQ, (a2), (a1)) -#define vec_any_nge(a1, a2) __builtin_altivec_vcmpgefp_p (__CR6_LT_REV, (a1), (a2)) -#define vec_any_nle(a1, a2) __builtin_altivec_vcmpgefp_p (__CR6_LT_REV, (a2), (a1)) +#define vec_all_nge(a1, a2) __builtin_vec_vcmpge_p (__CR6_EQ, (a1), (a2)) +#define vec_all_nle(a1, a2) __builtin_vec_vcmpge_p (__CR6_EQ, (a2), (a1)) +#define vec_any_nge(a1, a2) __builtin_vec_vcmpge_p (__CR6_LT_REV, (a1), (a2)) +#define vec_any_nle(a1, a2) __builtin_vec_vcmpge_p (__CR6_LT_REV, (a2), (a1)) #endif /* These do not accept vectors, so they do not have a __builtin_vec_* Index: gcc-4.3.4-20091019/gcc/config/rs6000/altivec.md =================================================================== --- gcc-4.3.4-20091019.orig/gcc/config/rs6000/altivec.md 2009-10-19 13:39:52.000000000 +0200 +++ gcc-4.3.4-20091019/gcc/config/rs6000/altivec.md 2009-10-19 13:40:37.000000000 +0200 @@ -20,19 +20,8 @@ ;; <http://www.gnu.org/licenses/>. 
(define_constants - [(UNSPEC_VCMPBFP 50) - (UNSPEC_VCMPEQUB 51) - (UNSPEC_VCMPEQUH 52) - (UNSPEC_VCMPEQUW 53) - (UNSPEC_VCMPEQFP 54) - (UNSPEC_VCMPGEFP 55) - (UNSPEC_VCMPGTUB 56) - (UNSPEC_VCMPGTSB 57) - (UNSPEC_VCMPGTUH 58) - (UNSPEC_VCMPGTSH 59) - (UNSPEC_VCMPGTUW 60) - (UNSPEC_VCMPGTSW 61) - (UNSPEC_VCMPGTFP 62) + ;; 51-62 deleted + [(UNSPEC_VCMPBFP 64) (UNSPEC_VMSUMU 65) (UNSPEC_VMSUMM 66) (UNSPEC_VMSUMSHM 68) @@ -63,7 +52,7 @@ (UNSPEC_VPKSHUS 101) (UNSPEC_VPKUWUS 102) (UNSPEC_VPKSWUS 103) - (UNSPEC_VRL 104) + ;; 104 deleted (UNSPEC_VSLV4SI 110) (UNSPEC_VSLO 111) (UNSPEC_VSR 118) @@ -76,9 +65,10 @@ (UNSPEC_VSUM2SWS 134) (UNSPEC_VSUMSWS 135) (UNSPEC_VPERM 144) - (UNSPEC_VRFIP 148) + (UNSPEC_VPERM_UNS 145) + ;; 148 deleted (UNSPEC_VRFIN 149) - (UNSPEC_VRFIM 150) + ;; 150 deleted (UNSPEC_VCFUX 151) (UNSPEC_VCFSX 152) (UNSPEC_VCTUXS 153) @@ -87,10 +77,7 @@ (UNSPEC_VEXPTEFP 156) (UNSPEC_VRSQRTEFP 157) (UNSPEC_VREFP 158) - (UNSPEC_VSEL4SI 159) - (UNSPEC_VSEL4SF 160) - (UNSPEC_VSEL8HI 161) - (UNSPEC_VSEL16QI 162) + ;; 159-162 deleted (UNSPEC_VLSDOI 163) (UNSPEC_VUPKHSB 167) (UNSPEC_VUPKHPX 168) @@ -98,7 +85,7 @@ (UNSPEC_VUPKLSB 170) (UNSPEC_VUPKLPX 171) (UNSPEC_VUPKLSH 172) - (UNSPEC_PREDICATE 173) + ;; 173 deleted (UNSPEC_DST 190) (UNSPEC_DSTT 191) (UNSPEC_DSTST 192) @@ -111,7 +98,7 @@ (UNSPEC_STVE 203) (UNSPEC_SET_VSCR 213) (UNSPEC_GET_VRSAVE 214) - (UNSPEC_REALIGN_LOAD 215) + ;; 215 deleted (UNSPEC_REDUC_PLUS 217) (UNSPEC_VECSH 219) (UNSPEC_EXTEVEN_V4SI 220) @@ -125,11 +112,11 @@ (UNSPEC_INTERHI_V4SI 228) (UNSPEC_INTERHI_V8HI 229) (UNSPEC_INTERHI_V16QI 230) - (UNSPEC_INTERHI_V4SF 231) + ;; delete 231 (UNSPEC_INTERLO_V4SI 232) (UNSPEC_INTERLO_V8HI 233) (UNSPEC_INTERLO_V16QI 234) - (UNSPEC_INTERLO_V4SF 235) + ;; delete 235 (UNSPEC_LVLX 236) (UNSPEC_LVLXL 237) (UNSPEC_LVRX 238) @@ -176,39 +163,22 @@ (define_mode_iterator VF [V4SF]) ;; Vec modes, pity mode iterators are not composable (define_mode_iterator V [V4SI V8HI V16QI V4SF]) +;; Vec modes for 
move/logical/permute ops, include vector types for move not +;; otherwise handled by altivec (v2df, v2di, ti) +(define_mode_iterator VM [V4SI V8HI V16QI V4SF V2DF V2DI TI]) -(define_mode_attr VI_char [(V4SI "w") (V8HI "h") (V16QI "b")]) - -;; Generic LVX load instruction. -(define_insn "altivec_lvx_<mode>" - [(set (match_operand:V 0 "altivec_register_operand" "=v") - (match_operand:V 1 "memory_operand" "Z"))] - "TARGET_ALTIVEC" - "lvx %0,%y1" - [(set_attr "type" "vecload")]) +;; Like VM, except don't do TImode +(define_mode_iterator VM2 [V4SI V8HI V16QI V4SF V2DF V2DI]) -;; Generic STVX store instruction. -(define_insn "altivec_stvx_<mode>" - [(set (match_operand:V 0 "memory_operand" "=Z") - (match_operand:V 1 "altivec_register_operand" "v"))] - "TARGET_ALTIVEC" - "stvx %1,%y0" - [(set_attr "type" "vecstore")]) +(define_mode_attr VI_char [(V4SI "w") (V8HI "h") (V16QI "b")]) ;; Vector move instructions. -(define_expand "mov<mode>" - [(set (match_operand:V 0 "nonimmediate_operand" "") - (match_operand:V 1 "any_operand" ""))] - "TARGET_ALTIVEC" -{ - rs6000_emit_move (operands[0], operands[1], <MODE>mode); - DONE; -}) - -(define_insn "*mov<mode>_internal" - [(set (match_operand:V 0 "nonimmediate_operand" "=Z,v,v,o,r,r,v") - (match_operand:V 1 "input_operand" "v,Z,v,r,o,r,W"))] - "TARGET_ALTIVEC +;; Use 'Q' for gpr moves to force the address to a single register, from which we can do the +;; split and create offsetable addresses for each word +(define_insn "*altivec_mov<mode>" + [(set (match_operand:VM2 0 "nonimmediate_operand" "=Z,v,v,*Q,*r,*r,v,v") + (match_operand:VM2 1 "input_operand" "v,Z,v,r,Q,r,j,W"))] + "VECTOR_MEM_ALTIVEC_P (<MODE>mode) && (register_operand (operands[0], <MODE>mode) || register_operand (operands[1], <MODE>mode))" { @@ -220,52 +190,71 @@ case 3: return "#"; case 4: return "#"; case 5: return "#"; - case 6: return output_vec_const_move (operands); + case 6: return "vxor %0,%0,%0"; + case 7: return output_vec_const_move (operands); default: 
gcc_unreachable (); } } - [(set_attr "type" "vecstore,vecload,vecsimple,store,load,*,*")]) + [(set_attr "type" "vecstore,vecload,vecsimple,store,load,*,vecsimple,*")]) -(define_split - [(set (match_operand:V4SI 0 "nonimmediate_operand" "") - (match_operand:V4SI 1 "input_operand" ""))] - "TARGET_ALTIVEC && reload_completed - && gpr_or_gpr_p (operands[0], operands[1])" - [(pc)] +;; Unlike other altivec moves, allow the GPRs, since a normal use of TImode +;; is for unions. However for plain data movement, slightly favor the vector +;; loads +(define_insn "*altivec_movti" + [(set (match_operand:TI 0 "nonimmediate_operand" "=Z,v,v,?o,?r,?r,v,v") + (match_operand:TI 1 "input_operand" "v,Z,v,r,o,r,j,W"))] + "VECTOR_MEM_ALTIVEC_P (TImode) + && (register_operand (operands[0], TImode) + || register_operand (operands[1], TImode))" { - rs6000_split_multireg_move (operands[0], operands[1]); DONE; -}) + switch (which_alternative) + { + case 0: return "stvx %1,%y0"; + case 1: return "lvx %0,%y1"; + case 2: return "vor %0,%1,%1"; + case 3: return "#"; + case 4: return "#"; + case 5: return "#"; + case 6: return "vxor %0,%0,%0"; + case 7: return output_vec_const_move (operands); + default: gcc_unreachable (); + } +} + [(set_attr "type" "vecstore,vecload,vecsimple,store,load,*,vecsimple,*")]) +;; Load up a vector with the most significant bit set by loading up -1 and +;; doing a shift left (define_split - [(set (match_operand:V8HI 0 "nonimmediate_operand" "") - (match_operand:V8HI 1 "input_operand" ""))] - "TARGET_ALTIVEC && reload_completed - && gpr_or_gpr_p (operands[0], operands[1])" - [(pc)] -{ rs6000_split_multireg_move (operands[0], operands[1]); DONE; }) + [(set (match_operand:VM 0 "altivec_register_operand" "") + (match_operand:VM 1 "easy_vector_constant_msb" ""))] + "VECTOR_UNIT_ALTIVEC_P (<MODE>mode) && reload_completed" + [(const_int 0)] +{ + rtx dest = operands[0]; + enum machine_mode mode = GET_MODE (operands[0]); + rtvec v; + int i, num_elements; -(define_split - [(set 
(match_operand:V16QI 0 "nonimmediate_operand" "") - (match_operand:V16QI 1 "input_operand" ""))] - "TARGET_ALTIVEC && reload_completed - && gpr_or_gpr_p (operands[0], operands[1])" - [(pc)] -{ rs6000_split_multireg_move (operands[0], operands[1]); DONE; }) + if (mode == V4SFmode) + { + mode = V4SImode; + dest = gen_lowpart (V4SImode, dest); + } -(define_split - [(set (match_operand:V4SF 0 "nonimmediate_operand" "") - (match_operand:V4SF 1 "input_operand" ""))] - "TARGET_ALTIVEC && reload_completed - && gpr_or_gpr_p (operands[0], operands[1])" - [(pc)] -{ - rs6000_split_multireg_move (operands[0], operands[1]); DONE; + num_elements = GET_MODE_NUNITS (mode); + v = rtvec_alloc (num_elements); + for (i = 0; i < num_elements; i++) + RTVEC_ELT (v, i) = constm1_rtx; + + emit_insn (gen_vec_initv4si (dest, gen_rtx_PARALLEL (mode, v))); + emit_insn (gen_rtx_SET (VOIDmode, dest, gen_rtx_ASHIFT (mode, dest, dest))); + DONE; }) (define_split - [(set (match_operand:V 0 "altivec_register_operand" "") - (match_operand:V 1 "easy_vector_constant_add_self" ""))] - "TARGET_ALTIVEC && reload_completed" + [(set (match_operand:VM 0 "altivec_register_operand" "") + (match_operand:VM 1 "easy_vector_constant_add_self" ""))] + "VECTOR_UNIT_ALTIVEC_P (<MODE>mode) && reload_completed" [(set (match_dup 0) (match_dup 3)) (set (match_dup 0) (match_dup 4))] { @@ -346,11 +335,11 @@ "vaddu<VI_char>m %0,%1,%2" [(set_attr "type" "vecsimple")]) -(define_insn "addv4sf3" +(define_insn "*altivec_addv4sf3" [(set (match_operand:V4SF 0 "register_operand" "=v") (plus:V4SF (match_operand:V4SF 1 "register_operand" "v") (match_operand:V4SF 2 "register_operand" "v")))] - "TARGET_ALTIVEC" + "VECTOR_UNIT_ALTIVEC_P (V4SFmode)" "vaddfp %0,%1,%2" [(set_attr "type" "vecfloat")]) @@ -392,11 +381,11 @@ "vsubu<VI_char>m %0,%1,%2" [(set_attr "type" "vecsimple")]) -(define_insn "subv4sf3" +(define_insn "*altivec_subv4sf3" [(set (match_operand:V4SF 0 "register_operand" "=v") (minus:V4SF (match_operand:V4SF 1 
"register_operand" "v") (match_operand:V4SF 2 "register_operand" "v")))] - "TARGET_ALTIVEC" + "VECTOR_UNIT_ALTIVEC_P (V4SFmode)" "vsubfp %0,%1,%2" [(set_attr "type" "vecfloat")]) @@ -457,131 +446,93 @@ "vcmpbfp %0,%1,%2" [(set_attr "type" "veccmp")]) -(define_insn "altivec_vcmpequb" - [(set (match_operand:V16QI 0 "register_operand" "=v") - (unspec:V16QI [(match_operand:V16QI 1 "register_operand" "v") - (match_operand:V16QI 2 "register_operand" "v")] - UNSPEC_VCMPEQUB))] +(define_insn "*altivec_eq<mode>" + [(set (match_operand:VI 0 "altivec_register_operand" "=v") + (eq:VI (match_operand:VI 1 "altivec_register_operand" "v") + (match_operand:VI 2 "altivec_register_operand" "v")))] "TARGET_ALTIVEC" - "vcmpequb %0,%1,%2" - [(set_attr "type" "vecsimple")]) + "vcmpequ<VI_char> %0,%1,%2" + [(set_attr "type" "veccmp")]) -(define_insn "altivec_vcmpequh" - [(set (match_operand:V8HI 0 "register_operand" "=v") - (unspec:V8HI [(match_operand:V8HI 1 "register_operand" "v") - (match_operand:V8HI 2 "register_operand" "v")] - UNSPEC_VCMPEQUH))] +(define_insn "*altivec_gt<mode>" + [(set (match_operand:VI 0 "altivec_register_operand" "=v") + (gt:VI (match_operand:VI 1 "altivec_register_operand" "v") + (match_operand:VI 2 "altivec_register_operand" "v")))] "TARGET_ALTIVEC" - "vcmpequh %0,%1,%2" - [(set_attr "type" "vecsimple")]) + "vcmpgts<VI_char> %0,%1,%2" + [(set_attr "type" "veccmp")]) -(define_insn "altivec_vcmpequw" - [(set (match_operand:V4SI 0 "register_operand" "=v") - (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "v") - (match_operand:V4SI 2 "register_operand" "v")] - UNSPEC_VCMPEQUW))] +(define_insn "*altivec_gtu<mode>" + [(set (match_operand:VI 0 "altivec_register_operand" "=v") + (gtu:VI (match_operand:VI 1 "altivec_register_operand" "v") + (match_operand:VI 2 "altivec_register_operand" "v")))] "TARGET_ALTIVEC" - "vcmpequw %0,%1,%2" - [(set_attr "type" "vecsimple")]) + "vcmpgtu<VI_char> %0,%1,%2" + [(set_attr "type" "veccmp")]) -(define_insn "altivec_vcmpeqfp" - 
[(set (match_operand:V4SI 0 "register_operand" "=v") - (unspec:V4SI [(match_operand:V4SF 1 "register_operand" "v") - (match_operand:V4SF 2 "register_operand" "v")] - UNSPEC_VCMPEQFP))] - "TARGET_ALTIVEC" +(define_insn "*altivec_eqv4sf" + [(set (match_operand:V4SF 0 "altivec_register_operand" "=v") + (eq:V4SF (match_operand:V4SF 1 "altivec_register_operand" "v") + (match_operand:V4SF 2 "altivec_register_operand" "v")))] + "VECTOR_UNIT_ALTIVEC_P (V4SFmode)" "vcmpeqfp %0,%1,%2" [(set_attr "type" "veccmp")]) -(define_insn "altivec_vcmpgefp" - [(set (match_operand:V4SI 0 "register_operand" "=v") - (unspec:V4SI [(match_operand:V4SF 1 "register_operand" "v") - (match_operand:V4SF 2 "register_operand" "v")] - UNSPEC_VCMPGEFP))] - "TARGET_ALTIVEC" - "vcmpgefp %0,%1,%2" +(define_insn "*altivec_gtv4sf" + [(set (match_operand:V4SF 0 "altivec_register_operand" "=v") + (gt:V4SF (match_operand:V4SF 1 "altivec_register_operand" "v") + (match_operand:V4SF 2 "altivec_register_operand" "v")))] + "VECTOR_UNIT_ALTIVEC_P (V4SFmode)" + "vcmpgtfp %0,%1,%2" [(set_attr "type" "veccmp")]) -(define_insn "altivec_vcmpgtub" - [(set (match_operand:V16QI 0 "register_operand" "=v") - (unspec:V16QI [(match_operand:V16QI 1 "register_operand" "v") - (match_operand:V16QI 2 "register_operand" "v")] - UNSPEC_VCMPGTUB))] - "TARGET_ALTIVEC" - "vcmpgtub %0,%1,%2" - [(set_attr "type" "vecsimple")]) - -(define_insn "altivec_vcmpgtsb" - [(set (match_operand:V16QI 0 "register_operand" "=v") - (unspec:V16QI [(match_operand:V16QI 1 "register_operand" "v") - (match_operand:V16QI 2 "register_operand" "v")] - UNSPEC_VCMPGTSB))] - "TARGET_ALTIVEC" - "vcmpgtsb %0,%1,%2" - [(set_attr "type" "vecsimple")]) - -(define_insn "altivec_vcmpgtuh" - [(set (match_operand:V8HI 0 "register_operand" "=v") - (unspec:V8HI [(match_operand:V8HI 1 "register_operand" "v") - (match_operand:V8HI 2 "register_operand" "v")] - UNSPEC_VCMPGTUH))] - "TARGET_ALTIVEC" - "vcmpgtuh %0,%1,%2" - [(set_attr "type" "vecsimple")]) - -(define_insn 
"altivec_vcmpgtsh" - [(set (match_operand:V8HI 0 "register_operand" "=v") - (unspec:V8HI [(match_operand:V8HI 1 "register_operand" "v") - (match_operand:V8HI 2 "register_operand" "v")] - UNSPEC_VCMPGTSH))] - "TARGET_ALTIVEC" - "vcmpgtsh %0,%1,%2" - [(set_attr "type" "vecsimple")]) - -(define_insn "altivec_vcmpgtuw" - [(set (match_operand:V4SI 0 "register_operand" "=v") - (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "v") - (match_operand:V4SI 2 "register_operand" "v")] - UNSPEC_VCMPGTUW))] - "TARGET_ALTIVEC" - "vcmpgtuw %0,%1,%2" - [(set_attr "type" "vecsimple")]) - -(define_insn "altivec_vcmpgtsw" - [(set (match_operand:V4SI 0 "register_operand" "=v") - (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "v") - (match_operand:V4SI 2 "register_operand" "v")] - UNSPEC_VCMPGTSW))] - "TARGET_ALTIVEC" - "vcmpgtsw %0,%1,%2" - [(set_attr "type" "vecsimple")]) - -(define_insn "altivec_vcmpgtfp" - [(set (match_operand:V4SI 0 "register_operand" "=v") - (unspec:V4SI [(match_operand:V4SF 1 "register_operand" "v") - (match_operand:V4SF 2 "register_operand" "v")] - UNSPEC_VCMPGTFP))] - "TARGET_ALTIVEC" - "vcmpgtfp %0,%1,%2" +(define_insn "*altivec_gev4sf" + [(set (match_operand:V4SF 0 "altivec_register_operand" "=v") + (ge:V4SF (match_operand:V4SF 1 "altivec_register_operand" "v") + (match_operand:V4SF 2 "altivec_register_operand" "v")))] + "VECTOR_UNIT_ALTIVEC_P (V4SFmode)" + "vcmpgefp %0,%1,%2" [(set_attr "type" "veccmp")]) +(define_insn "*altivec_vsel<mode>" + [(set (match_operand:VM 0 "altivec_register_operand" "=v") + (if_then_else:VM + (ne:CC (match_operand:VM 1 "altivec_register_operand" "v") + (const_int 0)) + (match_operand:VM 2 "altivec_register_operand" "v") + (match_operand:VM 3 "altivec_register_operand" "v")))] + "VECTOR_MEM_ALTIVEC_P (<MODE>mode)" + "vsel %0,%3,%2,%1" + [(set_attr "type" "vecperm")]) + +(define_insn "*altivec_vsel<mode>_uns" + [(set (match_operand:VM 0 "altivec_register_operand" "=v") + (if_then_else:VM + (ne:CCUNS (match_operand:VM 1 
"altivec_register_operand" "v") + (const_int 0)) + (match_operand:VM 2 "altivec_register_operand" "v") + (match_operand:VM 3 "altivec_register_operand" "v")))] + "VECTOR_MEM_ALTIVEC_P (<MODE>mode)" + "vsel %0,%3,%2,%1" + [(set_attr "type" "vecperm")]) + ;; Fused multiply add (define_insn "altivec_vmaddfp" [(set (match_operand:V4SF 0 "register_operand" "=v") (plus:V4SF (mult:V4SF (match_operand:V4SF 1 "register_operand" "v") (match_operand:V4SF 2 "register_operand" "v")) (match_operand:V4SF 3 "register_operand" "v")))] - "TARGET_ALTIVEC" + "VECTOR_UNIT_ALTIVEC_P (V4SFmode)" "vmaddfp %0,%1,%2,%3" [(set_attr "type" "vecfloat")]) ;; We do multiply as a fused multiply-add with an add of a -0.0 vector. -(define_expand "mulv4sf3" +(define_expand "altivec_mulv4sf3" [(use (match_operand:V4SF 0 "register_operand" "")) (use (match_operand:V4SF 1 "register_operand" "")) (use (match_operand:V4SF 2 "register_operand" ""))] - "TARGET_ALTIVEC && TARGET_FUSED_MADD" + "VECTOR_UNIT_ALTIVEC_P (V4SFmode) && TARGET_FUSED_MADD" " { rtx neg0; @@ -631,7 +582,7 @@ emit_insn (gen_altivec_vspltisw (sixteen, gen_rtx_CONST_INT (V4SImode, -16))); swap = gen_reg_rtx (V4SImode); - emit_insn (gen_altivec_vrlw (swap, operands[2], sixteen)); + emit_insn (gen_vrotlv4si3 (swap, operands[2], sixteen)); one = gen_reg_rtx (V8HImode); convert_move (one, operands[1], 0); @@ -655,6 +606,28 @@ DONE; }") +(define_expand "mulv8hi3" + [(use (match_operand:V8HI 0 "register_operand" "")) + (use (match_operand:V8HI 1 "register_operand" "")) + (use (match_operand:V8HI 2 "register_operand" ""))] + "TARGET_ALTIVEC" + " +{ + rtx odd = gen_reg_rtx (V4SImode); + rtx even = gen_reg_rtx (V4SImode); + rtx high = gen_reg_rtx (V4SImode); + rtx low = gen_reg_rtx (V4SImode); + + emit_insn (gen_altivec_vmulesh (even, operands[1], operands[2])); + emit_insn (gen_altivec_vmulosh (odd, operands[1], operands[2])); + + emit_insn (gen_altivec_vmrghw (high, even, odd)); + emit_insn (gen_altivec_vmrglw (low, even, odd)); + + emit_insn 
(gen_altivec_vpkuwum (operands[0], high, low)); + + DONE; +}") ;; Fused multiply subtract (define_insn "altivec_vnmsubfp" @@ -662,7 +635,7 @@ (neg:V4SF (minus:V4SF (mult:V4SF (match_operand:V4SF 1 "register_operand" "v") (match_operand:V4SF 2 "register_operand" "v")) (match_operand:V4SF 3 "register_operand" "v"))))] - "TARGET_ALTIVEC" + "VECTOR_UNIT_ALTIVEC_P (V4SFmode)" "vnmsubfp %0,%1,%2,%3" [(set_attr "type" "vecfloat")]) @@ -736,11 +709,11 @@ "vmaxs<VI_char> %0,%1,%2" [(set_attr "type" "vecsimple")]) -(define_insn "smaxv4sf3" +(define_insn "*altivec_smaxv4sf3" [(set (match_operand:V4SF 0 "register_operand" "=v") (smax:V4SF (match_operand:V4SF 1 "register_operand" "v") (match_operand:V4SF 2 "register_operand" "v")))] - "TARGET_ALTIVEC" + "VECTOR_UNIT_ALTIVEC_P (V4SFmode)" "vmaxfp %0,%1,%2" [(set_attr "type" "veccmp")]) @@ -760,11 +733,11 @@ "vmins<VI_char> %0,%1,%2" [(set_attr "type" "vecsimple")]) -(define_insn "sminv4sf3" +(define_insn "*altivec_sminv4sf3" [(set (match_operand:V4SF 0 "register_operand" "=v") (smin:V4SF (match_operand:V4SF 1 "register_operand" "v") (match_operand:V4SF 2 "register_operand" "v")))] - "TARGET_ALTIVEC" + "VECTOR_UNIT_ALTIVEC_P (V4SFmode)" "vminfp %0,%1,%2" [(set_attr "type" "veccmp")]) @@ -879,11 +852,11 @@ (const_int 3) (const_int 1)])) (const_int 5)))] - "TARGET_ALTIVEC" + "VECTOR_MEM_ALTIVEC_P (V4SImode)" "vmrghw %0,%1,%2" [(set_attr "type" "vecperm")]) -(define_insn "altivec_vmrghsf" +(define_insn "*altivec_vmrghsf" [(set (match_operand:V4SF 0 "register_operand" "=v") (vec_merge:V4SF (vec_select:V4SF (match_operand:V4SF 1 "register_operand" "v") (parallel [(const_int 0) @@ -896,7 +869,7 @@ (const_int 3) (const_int 1)])) (const_int 5)))] - "TARGET_ALTIVEC" + "VECTOR_MEM_ALTIVEC_P (V4SFmode)" "vmrghw %0,%1,%2" [(set_attr "type" "vecperm")]) @@ -968,35 +941,37 @@ (define_insn "altivec_vmrglw" [(set (match_operand:V4SI 0 "register_operand" "=v") - (vec_merge:V4SI (vec_select:V4SI (match_operand:V4SI 1 "register_operand" "v") - 
(parallel [(const_int 2) - (const_int 0) - (const_int 3) - (const_int 1)])) - (vec_select:V4SI (match_operand:V4SI 2 "register_operand" "v") - (parallel [(const_int 0) - (const_int 2) - (const_int 1) - (const_int 3)])) - (const_int 5)))] - "TARGET_ALTIVEC" + (vec_merge:V4SI + (vec_select:V4SI (match_operand:V4SI 1 "register_operand" "v") + (parallel [(const_int 2) + (const_int 0) + (const_int 3) + (const_int 1)])) + (vec_select:V4SI (match_operand:V4SI 2 "register_operand" "v") + (parallel [(const_int 0) + (const_int 2) + (const_int 1) + (const_int 3)])) + (const_int 5)))] + "VECTOR_MEM_ALTIVEC_P (V4SImode)" "vmrglw %0,%1,%2" [(set_attr "type" "vecperm")]) -(define_insn "altivec_vmrglsf" +(define_insn "*altivec_vmrglsf" [(set (match_operand:V4SF 0 "register_operand" "=v") - (vec_merge:V4SF (vec_select:V4SF (match_operand:V4SF 1 "register_operand" "v") - (parallel [(const_int 2) - (const_int 0) - (const_int 3) - (const_int 1)])) - (vec_select:V4SF (match_operand:V4SF 2 "register_operand" "v") - (parallel [(const_int 0) - (const_int 2) - (const_int 1) - (const_int 3)])) - (const_int 5)))] - "TARGET_ALTIVEC" + (vec_merge:V4SF + (vec_select:V4SF (match_operand:V4SF 1 "register_operand" "v") + (parallel [(const_int 2) + (const_int 0) + (const_int 3) + (const_int 1)])) + (vec_select:V4SF (match_operand:V4SF 2 "register_operand" "v") + (parallel [(const_int 0) + (const_int 2) + (const_int 1) + (const_int 3)])) + (const_int 5)))] + "VECTOR_MEM_ALTIVEC_P (V4SFmode)" "vmrglw %0,%1,%2" [(set_attr "type" "vecperm")]) @@ -1073,68 +1048,53 @@ [(set_attr "type" "veccomplex")]) -;; logical ops +;; logical ops. 
Have the logical ops follow the memory ops in +;; terms of whether to prefer VSX or Altivec -(define_insn "and<mode>3" - [(set (match_operand:VI 0 "register_operand" "=v") - (and:VI (match_operand:VI 1 "register_operand" "v") - (match_operand:VI 2 "register_operand" "v")))] - "TARGET_ALTIVEC" +(define_insn "*altivec_and<mode>3" + [(set (match_operand:VM 0 "register_operand" "=v") + (and:VM (match_operand:VM 1 "register_operand" "v") + (match_operand:VM 2 "register_operand" "v")))] + "VECTOR_MEM_ALTIVEC_P (<MODE>mode)" "vand %0,%1,%2" [(set_attr "type" "vecsimple")]) -(define_insn "ior<mode>3" - [(set (match_operand:VI 0 "register_operand" "=v") - (ior:VI (match_operand:VI 1 "register_operand" "v") - (match_operand:VI 2 "register_operand" "v")))] - "TARGET_ALTIVEC" +(define_insn "*altivec_ior<mode>3" + [(set (match_operand:VM 0 "register_operand" "=v") + (ior:VM (match_operand:VM 1 "register_operand" "v") + (match_operand:VM 2 "register_operand" "v")))] + "VECTOR_MEM_ALTIVEC_P (<MODE>mode)" "vor %0,%1,%2" [(set_attr "type" "vecsimple")]) -(define_insn "xor<mode>3" - [(set (match_operand:VI 0 "register_operand" "=v") - (xor:VI (match_operand:VI 1 "register_operand" "v") - (match_operand:VI 2 "register_operand" "v")))] - "TARGET_ALTIVEC" +(define_insn "*altivec_xor<mode>3" + [(set (match_operand:VM 0 "register_operand" "=v") + (xor:VM (match_operand:VM 1 "register_operand" "v") + (match_operand:VM 2 "register_operand" "v")))] + "VECTOR_MEM_ALTIVEC_P (<MODE>mode)" "vxor %0,%1,%2" [(set_attr "type" "vecsimple")]) -(define_insn "xorv4sf3" - [(set (match_operand:V4SF 0 "register_operand" "=v") - (xor:V4SF (match_operand:V4SF 1 "register_operand" "v") - (match_operand:V4SF 2 "register_operand" "v")))] - "TARGET_ALTIVEC" - "vxor %0,%1,%2" - [(set_attr "type" "vecsimple")]) - -(define_insn "one_cmpl<mode>2" - [(set (match_operand:VI 0 "register_operand" "=v") - (not:VI (match_operand:VI 1 "register_operand" "v")))] - "TARGET_ALTIVEC" +(define_insn "*altivec_one_cmpl<mode>2" 
+ [(set (match_operand:VM 0 "register_operand" "=v") + (not:VM (match_operand:VM 1 "register_operand" "v")))] + "VECTOR_MEM_ALTIVEC_P (<MODE>mode)" "vnor %0,%1,%1" [(set_attr "type" "vecsimple")]) -(define_insn "altivec_nor<mode>3" - [(set (match_operand:VI 0 "register_operand" "=v") - (not:VI (ior:VI (match_operand:VI 1 "register_operand" "v") - (match_operand:VI 2 "register_operand" "v"))))] - "TARGET_ALTIVEC" +(define_insn "*altivec_nor<mode>3" + [(set (match_operand:VM 0 "register_operand" "=v") + (not:VM (ior:VM (match_operand:VM 1 "register_operand" "v") + (match_operand:VM 2 "register_operand" "v"))))] + "VECTOR_MEM_ALTIVEC_P (<MODE>mode)" "vnor %0,%1,%2" [(set_attr "type" "vecsimple")]) -(define_insn "andc<mode>3" - [(set (match_operand:VI 0 "register_operand" "=v") - (and:VI (not:VI (match_operand:VI 2 "register_operand" "v")) - (match_operand:VI 1 "register_operand" "v")))] - "TARGET_ALTIVEC" - "vandc %0,%1,%2" - [(set_attr "type" "vecsimple")]) - -(define_insn "*andc3_v4sf" - [(set (match_operand:V4SF 0 "register_operand" "=v") - (and:V4SF (not:V4SF (match_operand:V4SF 2 "register_operand" "v")) - (match_operand:V4SF 1 "register_operand" "v")))] - "TARGET_ALTIVEC" +(define_insn "*altivec_andc<mode>3" + [(set (match_operand:VM 0 "register_operand" "=v") + (and:VM (not:VM (match_operand:VM 2 "register_operand" "v")) + (match_operand:VM 1 "register_operand" "v")))] + "VECTOR_MEM_ALTIVEC_P (<MODE>mode)" "vandc %0,%1,%2" [(set_attr "type" "vecsimple")]) @@ -1225,11 +1185,10 @@ "vpkswus %0,%1,%2" [(set_attr "type" "vecperm")]) -(define_insn "altivec_vrl<VI_char>" +(define_insn "*altivec_vrl<VI_char>" [(set (match_operand:VI 0 "register_operand" "=v") - (unspec:VI [(match_operand:VI 1 "register_operand" "v") - (match_operand:VI 2 "register_operand" "v")] - UNSPEC_VRL))] + (rotate:VI (match_operand:VI 1 "register_operand" "v") + (match_operand:VI 2 "register_operand" "v")))] "TARGET_ALTIVEC" "vrl<VI_char> %0,%1,%2" [(set_attr "type" "vecsimple")]) @@ -1252,26 
+1211,26 @@ "vslo %0,%1,%2" [(set_attr "type" "vecperm")]) -(define_insn "vashl<mode>3" +(define_insn "*altivec_vsl<VI_char>" [(set (match_operand:VI 0 "register_operand" "=v") (ashift:VI (match_operand:VI 1 "register_operand" "v") - (match_operand:VI 2 "register_operand" "v") ))] + (match_operand:VI 2 "register_operand" "v")))] "TARGET_ALTIVEC" "vsl<VI_char> %0,%1,%2" [(set_attr "type" "vecsimple")]) -(define_insn "vlshr<mode>3" +(define_insn "*altivec_vsr<VI_char>" [(set (match_operand:VI 0 "register_operand" "=v") (lshiftrt:VI (match_operand:VI 1 "register_operand" "v") - (match_operand:VI 2 "register_operand" "v") ))] + (match_operand:VI 2 "register_operand" "v")))] "TARGET_ALTIVEC" "vsr<VI_char> %0,%1,%2" [(set_attr "type" "vecsimple")]) -(define_insn "vashr<mode>3" +(define_insn "*altivec_vsra<VI_char>" [(set (match_operand:VI 0 "register_operand" "=v") (ashiftrt:VI (match_operand:VI 1 "register_operand" "v") - (match_operand:VI 2 "register_operand" "v") ))] + (match_operand:VI 2 "register_operand" "v")))] "TARGET_ALTIVEC" "vsra<VI_char> %0,%1,%2" [(set_attr "type" "vecsimple")]) @@ -1364,13 +1323,13 @@ "vspltw %0,%1,%2" [(set_attr "type" "vecperm")]) -(define_insn "*altivec_vspltsf" +(define_insn "altivec_vspltsf" [(set (match_operand:V4SF 0 "register_operand" "=v") (vec_duplicate:V4SF (vec_select:SF (match_operand:V4SF 1 "register_operand" "v") (parallel [(match_operand:QI 2 "u5bit_cint_operand" "i")]))))] - "TARGET_ALTIVEC" + "VECTOR_UNIT_ALTIVEC_P (V4SFmode)" "vspltw %0,%1,%2" [(set_attr "type" "vecperm")]) @@ -1382,27 +1341,37 @@ "vspltis<VI_char> %0,%1" [(set_attr "type" "vecperm")]) -(define_insn "ftruncv4sf2" +(define_insn "*altivec_vrfiz" [(set (match_operand:V4SF 0 "register_operand" "=v") (fix:V4SF (match_operand:V4SF 1 "register_operand" "v")))] - "TARGET_ALTIVEC" + "VECTOR_UNIT_ALTIVEC_P (V4SFmode)" "vrfiz %0,%1" [(set_attr "type" "vecfloat")]) (define_insn "altivec_vperm_<mode>" - [(set (match_operand:V 0 "register_operand" "=v") - (unspec:V 
[(match_operand:V 1 "register_operand" "v") - (match_operand:V 2 "register_operand" "v") - (match_operand:V16QI 3 "register_operand" "v")] - UNSPEC_VPERM))] + [(set (match_operand:VM 0 "register_operand" "=v") + (unspec:VM [(match_operand:VM 1 "register_operand" "v") + (match_operand:VM 2 "register_operand" "v") + (match_operand:V16QI 3 "register_operand" "v")] + UNSPEC_VPERM))] + "TARGET_ALTIVEC" + "vperm %0,%1,%2,%3" + [(set_attr "type" "vecperm")]) + +(define_insn "altivec_vperm_<mode>_uns" + [(set (match_operand:VM 0 "register_operand" "=v") + (unspec:VM [(match_operand:VM 1 "register_operand" "v") + (match_operand:VM 2 "register_operand" "v") + (match_operand:V16QI 3 "register_operand" "v")] + UNSPEC_VPERM_UNS))] "TARGET_ALTIVEC" "vperm %0,%1,%2,%3" [(set_attr "type" "vecperm")]) -(define_insn "altivec_vrfip" +(define_insn "altivec_vrfip" ; ceil [(set (match_operand:V4SF 0 "register_operand" "=v") (unspec:V4SF [(match_operand:V4SF 1 "register_operand" "v")] - UNSPEC_VRFIP))] + UNSPEC_FRIP))] "TARGET_ALTIVEC" "vrfip %0,%1" [(set_attr "type" "vecfloat")]) @@ -1415,10 +1384,10 @@ "vrfin %0,%1" [(set_attr "type" "vecfloat")]) -(define_insn "altivec_vrfim" +(define_insn "*altivec_vrfim" ; floor [(set (match_operand:V4SF 0 "register_operand" "=v") (unspec:V4SF [(match_operand:V4SF 1 "register_operand" "v")] - UNSPEC_VRFIM))] + UNSPEC_FRIM))] "TARGET_ALTIVEC" "vrfim %0,%1" [(set_attr "type" "vecfloat")]) @@ -1493,185 +1462,33 @@ "vrefp %0,%1" [(set_attr "type" "vecfloat")]) -(define_expand "vcondv4si" - [(set (match_operand:V4SI 0 "register_operand" "=v") - (if_then_else:V4SI - (match_operator 3 "comparison_operator" - [(match_operand:V4SI 4 "register_operand" "v") - (match_operand:V4SI 5 "register_operand" "v")]) - (match_operand:V4SI 1 "register_operand" "v") - (match_operand:V4SI 2 "register_operand" "v")))] - "TARGET_ALTIVEC" - " -{ - if (rs6000_emit_vector_cond_expr (operands[0], operands[1], operands[2], - operands[3], operands[4], operands[5])) - DONE; - else 
- FAIL; -} - ") - -(define_expand "vconduv4si" - [(set (match_operand:V4SI 0 "register_operand" "=v") - (if_then_else:V4SI - (match_operator 3 "comparison_operator" - [(match_operand:V4SI 4 "register_operand" "v") - (match_operand:V4SI 5 "register_operand" "v")]) - (match_operand:V4SI 1 "register_operand" "v") - (match_operand:V4SI 2 "register_operand" "v")))] - "TARGET_ALTIVEC" - " -{ - if (rs6000_emit_vector_cond_expr (operands[0], operands[1], operands[2], - operands[3], operands[4], operands[5])) - DONE; - else - FAIL; -} - ") - -(define_expand "vcondv4sf" - [(set (match_operand:V4SF 0 "register_operand" "=v") - (if_then_else:V4SF - (match_operator 3 "comparison_operator" - [(match_operand:V4SF 4 "register_operand" "v") - (match_operand:V4SF 5 "register_operand" "v")]) - (match_operand:V4SF 1 "register_operand" "v") - (match_operand:V4SF 2 "register_operand" "v")))] - "TARGET_ALTIVEC" - " -{ - if (rs6000_emit_vector_cond_expr (operands[0], operands[1], operands[2], - operands[3], operands[4], operands[5])) - DONE; - else - FAIL; -} - ") - -(define_expand "vcondv8hi" - [(set (match_operand:V8HI 0 "register_operand" "=v") - (if_then_else:V8HI - (match_operator 3 "comparison_operator" - [(match_operand:V8HI 4 "register_operand" "v") - (match_operand:V8HI 5 "register_operand" "v")]) - (match_operand:V8HI 1 "register_operand" "v") - (match_operand:V8HI 2 "register_operand" "v")))] - "TARGET_ALTIVEC" - " -{ - if (rs6000_emit_vector_cond_expr (operands[0], operands[1], operands[2], - operands[3], operands[4], operands[5])) - DONE; - else - FAIL; -} - ") - -(define_expand "vconduv8hi" - [(set (match_operand:V8HI 0 "register_operand" "=v") - (if_then_else:V8HI - (match_operator 3 "comparison_operator" - [(match_operand:V8HI 4 "register_operand" "v") - (match_operand:V8HI 5 "register_operand" "v")]) - (match_operand:V8HI 1 "register_operand" "v") - (match_operand:V8HI 2 "register_operand" "v")))] - "TARGET_ALTIVEC" - " -{ - if (rs6000_emit_vector_cond_expr (operands[0], 
operands[1], operands[2], - operands[3], operands[4], operands[5])) - DONE; - else - FAIL; -} - ") - -(define_expand "vcondv16qi" - [(set (match_operand:V16QI 0 "register_operand" "=v") - (if_then_else:V16QI - (match_operator 3 "comparison_operator" - [(match_operand:V16QI 4 "register_operand" "v") - (match_operand:V16QI 5 "register_operand" "v")]) - (match_operand:V16QI 1 "register_operand" "v") - (match_operand:V16QI 2 "register_operand" "v")))] - "TARGET_ALTIVEC" - " -{ - if (rs6000_emit_vector_cond_expr (operands[0], operands[1], operands[2], - operands[3], operands[4], operands[5])) - DONE; - else - FAIL; -} - ") - -(define_expand "vconduv16qi" - [(set (match_operand:V16QI 0 "register_operand" "=v") - (if_then_else:V16QI - (match_operator 3 "comparison_operator" - [(match_operand:V16QI 4 "register_operand" "v") - (match_operand:V16QI 5 "register_operand" "v")]) - (match_operand:V16QI 1 "register_operand" "v") - (match_operand:V16QI 2 "register_operand" "v")))] - "TARGET_ALTIVEC" - " -{ - if (rs6000_emit_vector_cond_expr (operands[0], operands[1], operands[2], - operands[3], operands[4], operands[5])) - DONE; - else - FAIL; -} - ") - - -(define_insn "altivec_vsel_v4si" - [(set (match_operand:V4SI 0 "register_operand" "=v") - (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "v") - (match_operand:V4SI 2 "register_operand" "v") - (match_operand:V4SI 3 "register_operand" "v")] - UNSPEC_VSEL4SI))] - "TARGET_ALTIVEC" - "vsel %0,%1,%2,%3" - [(set_attr "type" "vecperm")]) +(define_expand "altivec_copysign_v4sf3" + [(use (match_operand:V4SF 0 "register_operand" "")) + (use (match_operand:V4SF 1 "register_operand" "")) + (use (match_operand:V4SF 2 "register_operand" ""))] + "VECTOR_UNIT_ALTIVEC_P (V4SFmode)" + " +{ + rtx mask = gen_reg_rtx (V4SImode); + rtvec v = rtvec_alloc (4); + unsigned HOST_WIDE_INT mask_val = ((unsigned HOST_WIDE_INT)1) << 31; -(define_insn "altivec_vsel_v4sf" - [(set (match_operand:V4SF 0 "register_operand" "=v") - (unspec:V4SF 
[(match_operand:V4SF 1 "register_operand" "v") - (match_operand:V4SF 2 "register_operand" "v") - (match_operand:V4SI 3 "register_operand" "v")] - UNSPEC_VSEL4SF))] - "TARGET_ALTIVEC" - "vsel %0,%1,%2,%3" - [(set_attr "type" "vecperm")]) + RTVEC_ELT (v, 0) = GEN_INT (mask_val); + RTVEC_ELT (v, 1) = GEN_INT (mask_val); + RTVEC_ELT (v, 2) = GEN_INT (mask_val); + RTVEC_ELT (v, 3) = GEN_INT (mask_val); -(define_insn "altivec_vsel_v8hi" - [(set (match_operand:V8HI 0 "register_operand" "=v") - (unspec:V8HI [(match_operand:V8HI 1 "register_operand" "v") - (match_operand:V8HI 2 "register_operand" "v") - (match_operand:V8HI 3 "register_operand" "v")] - UNSPEC_VSEL8HI))] - "TARGET_ALTIVEC" - "vsel %0,%1,%2,%3" - [(set_attr "type" "vecperm")]) - -(define_insn "altivec_vsel_v16qi" - [(set (match_operand:V16QI 0 "register_operand" "=v") - (unspec:V16QI [(match_operand:V16QI 1 "register_operand" "v") - (match_operand:V16QI 2 "register_operand" "v") - (match_operand:V16QI 3 "register_operand" "v")] - UNSPEC_VSEL16QI))] - "TARGET_ALTIVEC" - "vsel %0,%1,%2,%3" - [(set_attr "type" "vecperm")]) + emit_insn (gen_vec_initv4si (mask, gen_rtx_PARALLEL (V4SImode, v))); + emit_insn (gen_vector_select_v4sf (operands[0], operands[1], operands[2], + gen_lowpart (V4SFmode, mask))); + DONE; +}") (define_insn "altivec_vsldoi_<mode>" - [(set (match_operand:V 0 "register_operand" "=v") - (unspec:V [(match_operand:V 1 "register_operand" "v") - (match_operand:V 2 "register_operand" "v") - (match_operand:QI 3 "immediate_operand" "i")] + [(set (match_operand:VM 0 "register_operand" "=v") + (unspec:VM [(match_operand:VM 1 "register_operand" "v") + (match_operand:VM 2 "register_operand" "v") + (match_operand:QI 3 "immediate_operand" "i")] UNSPEC_VLSDOI))] "TARGET_ALTIVEC" "vsldoi %0,%1,%2,%3" @@ -1725,50 +1542,92 @@ "vupklsh %0,%1" [(set_attr "type" "vecperm")]) -;; AltiVec predicates. 
+;; Compare vectors producing a vector result and a predicate, setting CR6 to +;; indicate a combined status +(define_insn "*altivec_vcmpequ<VI_char>_p" + [(set (reg:CC 74) + (unspec:CC [(eq:CC (match_operand:VI 1 "register_operand" "v") + (match_operand:VI 2 "register_operand" "v"))] + UNSPEC_PREDICATE)) + (set (match_operand:VI 0 "register_operand" "=v") + (eq:VI (match_dup 1) + (match_dup 2)))] + "VECTOR_UNIT_ALTIVEC_P (<MODE>mode)" + "vcmpequ<VI_char>. %0,%1,%2" + [(set_attr "type" "veccmp")]) -(define_expand "cr6_test_for_zero" - [(set (match_operand:SI 0 "register_operand" "=r") - (eq:SI (reg:CC 74) - (const_int 0)))] - "TARGET_ALTIVEC" - "") +(define_insn "*altivec_vcmpgts<VI_char>_p" + [(set (reg:CC 74) + (unspec:CC [(gt:CC (match_operand:VI 1 "register_operand" "v") + (match_operand:VI 2 "register_operand" "v"))] + UNSPEC_PREDICATE)) + (set (match_operand:VI 0 "register_operand" "=v") + (gt:VI (match_dup 1) + (match_dup 2)))] + "VECTOR_UNIT_ALTIVEC_P (<MODE>mode)" + "vcmpgts<VI_char>. %0,%1,%2" + [(set_attr "type" "veccmp")]) -(define_expand "cr6_test_for_zero_reverse" - [(set (match_operand:SI 0 "register_operand" "=r") - (eq:SI (reg:CC 74) - (const_int 0))) - (set (match_dup 0) (minus:SI (const_int 1) (match_dup 0)))] - "TARGET_ALTIVEC" - "") +(define_insn "*altivec_vcmpgtu<VI_char>_p" + [(set (reg:CC 74) + (unspec:CC [(gtu:CC (match_operand:VI 1 "register_operand" "v") + (match_operand:VI 2 "register_operand" "v"))] + UNSPEC_PREDICATE)) + (set (match_operand:VI 0 "register_operand" "=v") + (gtu:VI (match_dup 1) + (match_dup 2)))] + "VECTOR_UNIT_ALTIVEC_P (<MODE>mode)" + "vcmpgtu<VI_char>. 
%0,%1,%2" + [(set_attr "type" "veccmp")]) -(define_expand "cr6_test_for_lt" - [(set (match_operand:SI 0 "register_operand" "=r") - (lt:SI (reg:CC 74) - (const_int 0)))] - "TARGET_ALTIVEC" - "") +(define_insn "*altivec_vcmpeqfp_p" + [(set (reg:CC 74) + (unspec:CC [(eq:CC (match_operand:V4SF 1 "register_operand" "v") + (match_operand:V4SF 2 "register_operand" "v"))] + UNSPEC_PREDICATE)) + (set (match_operand:V4SF 0 "register_operand" "=v") + (eq:V4SF (match_dup 1) + (match_dup 2)))] + "VECTOR_UNIT_ALTIVEC_P (V4SFmode)" + "vcmpeqfp. %0,%1,%2" + [(set_attr "type" "veccmp")]) -(define_expand "cr6_test_for_lt_reverse" - [(set (match_operand:SI 0 "register_operand" "=r") - (lt:SI (reg:CC 74) - (const_int 0))) - (set (match_dup 0) (minus:SI (const_int 1) (match_dup 0)))] - "TARGET_ALTIVEC" - "") +(define_insn "*altivec_vcmpgtfp_p" + [(set (reg:CC 74) + (unspec:CC [(gt:CC (match_operand:V4SF 1 "register_operand" "v") + (match_operand:V4SF 2 "register_operand" "v"))] + UNSPEC_PREDICATE)) + (set (match_operand:V4SF 0 "register_operand" "=v") + (gt:V4SF (match_dup 1) + (match_dup 2)))] + "VECTOR_UNIT_ALTIVEC_P (V4SFmode)" + "vcmpgtfp. %0,%1,%2" + [(set_attr "type" "veccmp")]) -;; We can get away with generating the opcode on the fly (%3 below) -;; because all the predicates have the same scheduling parameters. +(define_insn "*altivec_vcmpgefp_p" + [(set (reg:CC 74) + (unspec:CC [(ge:CC (match_operand:V4SF 1 "register_operand" "v") + (match_operand:V4SF 2 "register_operand" "v"))] + UNSPEC_PREDICATE)) + (set (match_operand:V4SF 0 "register_operand" "=v") + (ge:V4SF (match_dup 1) + (match_dup 2)))] + "VECTOR_UNIT_ALTIVEC_P (V4SFmode)" + "vcmpgefp. 
%0,%1,%2" + [(set_attr "type" "veccmp")]) -(define_insn "altivec_predicate_<mode>" +(define_insn "altivec_vcmpbfp_p" [(set (reg:CC 74) - (unspec:CC [(match_operand:V 1 "register_operand" "v") - (match_operand:V 2 "register_operand" "v") - (match_operand 3 "any_operand" "")] UNSPEC_PREDICATE)) - (clobber (match_scratch:V 0 "=v"))] - "TARGET_ALTIVEC" - "%3 %0,%1,%2" -[(set_attr "type" "veccmp")]) + (unspec:CC [(match_operand:V4SF 1 "register_operand" "v") + (match_operand:V4SF 2 "register_operand" "v")] + UNSPEC_VCMPBFP)) + (set (match_operand:V4SF 0 "register_operand" "=v") + (unspec:V4SF [(match_dup 1) + (match_dup 2)] + UNSPEC_VCMPBFP))] + "VECTOR_UNIT_ALTIVEC_OR_VSX_P (V4SFmode)" + "vcmpbfp. %0,%1,%2" + [(set_attr "type" "veccmp")]) (define_insn "altivec_mtvscr" [(set (reg:SI 110) @@ -1937,95 +1796,6 @@ "stvewx %1,%y0" [(set_attr "type" "vecstore")]) -(define_expand "vec_init<mode>" - [(match_operand:V 0 "register_operand" "") - (match_operand 1 "" "")] - "TARGET_ALTIVEC" -{ - rs6000_expand_vector_init (operands[0], operands[1]); - DONE; -}) - -(define_expand "vec_setv4si" - [(match_operand:V4SI 0 "register_operand" "") - (match_operand:SI 1 "register_operand" "") - (match_operand 2 "const_int_operand" "")] - "TARGET_ALTIVEC" -{ - rs6000_expand_vector_set (operands[0], operands[1], INTVAL (operands[2])); - DONE; -}) - -(define_expand "vec_setv8hi" - [(match_operand:V8HI 0 "register_operand" "") - (match_operand:HI 1 "register_operand" "") - (match_operand 2 "const_int_operand" "")] - "TARGET_ALTIVEC" -{ - rs6000_expand_vector_set (operands[0], operands[1], INTVAL (operands[2])); - DONE; -}) - -(define_expand "vec_setv16qi" - [(match_operand:V16QI 0 "register_operand" "") - (match_operand:QI 1 "register_operand" "") - (match_operand 2 "const_int_operand" "")] - "TARGET_ALTIVEC" -{ - rs6000_expand_vector_set (operands[0], operands[1], INTVAL (operands[2])); - DONE; -}) - -(define_expand "vec_setv4sf" - [(match_operand:V4SF 0 "register_operand" "") - 
(match_operand:SF 1 "register_operand" "") - (match_operand 2 "const_int_operand" "")] - "TARGET_ALTIVEC" -{ - rs6000_expand_vector_set (operands[0], operands[1], INTVAL (operands[2])); - DONE; -}) - -(define_expand "vec_extractv4si" - [(match_operand:SI 0 "register_operand" "") - (match_operand:V4SI 1 "register_operand" "") - (match_operand 2 "const_int_operand" "")] - "TARGET_ALTIVEC" -{ - rs6000_expand_vector_extract (operands[0], operands[1], INTVAL (operands[2])); - DONE; -}) - -(define_expand "vec_extractv8hi" - [(match_operand:HI 0 "register_operand" "") - (match_operand:V8HI 1 "register_operand" "") - (match_operand 2 "const_int_operand" "")] - "TARGET_ALTIVEC" -{ - rs6000_expand_vector_extract (operands[0], operands[1], INTVAL (operands[2])); - DONE; -}) - -(define_expand "vec_extractv16qi" - [(match_operand:QI 0 "register_operand" "") - (match_operand:V16QI 1 "register_operand" "") - (match_operand 2 "const_int_operand" "")] - "TARGET_ALTIVEC" -{ - rs6000_expand_vector_extract (operands[0], operands[1], INTVAL (operands[2])); - DONE; -}) - -(define_expand "vec_extractv4sf" - [(match_operand:SF 0 "register_operand" "") - (match_operand:V4SF 1 "register_operand" "") - (match_operand 2 "const_int_operand" "")] - "TARGET_ALTIVEC" -{ - rs6000_expand_vector_extract (operands[0], operands[1], INTVAL (operands[2])); - DONE; -}) - ;; Generate ;; vspltis? SCRATCH0,0 ;; vsubu?m SCRATCH2,SCRATCH1,%1 @@ -2047,7 +1817,7 @@ ;; vspltisw SCRATCH1,-1 ;; vslw SCRATCH2,SCRATCH1,SCRATCH1 ;; vandc %0,%1,SCRATCH2 -(define_expand "absv4sf2" +(define_expand "altivec_absv4sf2" [(set (match_dup 2) (vec_duplicate:V4SI (const_int -1))) (set (match_dup 3) @@ -2080,66 +1850,6 @@ operands[3] = gen_reg_rtx (GET_MODE (operands[0])); }) -;; Vector shift left in bits. Currently supported ony for shift -;; amounts that can be expressed as byte shifts (divisible by 8). -;; General shift amounts can be supported using vslo + vsl. 
We're -;; not expecting to see these yet (the vectorizer currently -;; generates only shifts divisible by byte_size). -(define_expand "vec_shl_<mode>" - [(set (match_operand:V 0 "register_operand" "=v") - (unspec:V [(match_operand:V 1 "register_operand" "v") - (match_operand:QI 2 "reg_or_short_operand" "")] - UNSPEC_VECSH))] - "TARGET_ALTIVEC" - " -{ - rtx bitshift = operands[2]; - rtx byteshift = gen_reg_rtx (QImode); - HOST_WIDE_INT bitshift_val; - HOST_WIDE_INT byteshift_val; - - if (! CONSTANT_P (bitshift)) - FAIL; - bitshift_val = INTVAL (bitshift); - if (bitshift_val & 0x7) - FAIL; - byteshift_val = bitshift_val >> 3; - byteshift = gen_rtx_CONST_INT (QImode, byteshift_val); - emit_insn (gen_altivec_vsldoi_<mode> (operands[0], operands[1], operands[1], - byteshift)); - DONE; -}") - -;; Vector shift left in bits. Currently supported ony for shift -;; amounts that can be expressed as byte shifts (divisible by 8). -;; General shift amounts can be supported using vsro + vsr. We're -;; not expecting to see these yet (the vectorizer currently -;; generates only shifts divisible by byte_size). -(define_expand "vec_shr_<mode>" - [(set (match_operand:V 0 "register_operand" "=v") - (unspec:V [(match_operand:V 1 "register_operand" "v") - (match_operand:QI 2 "reg_or_short_operand" "")] - UNSPEC_VECSH))] - "TARGET_ALTIVEC" - " -{ - rtx bitshift = operands[2]; - rtx byteshift = gen_reg_rtx (QImode); - HOST_WIDE_INT bitshift_val; - HOST_WIDE_INT byteshift_val; - - if (! 
CONSTANT_P (bitshift)) - FAIL; - bitshift_val = INTVAL (bitshift); - if (bitshift_val & 0x7) - FAIL; - byteshift_val = 16 - (bitshift_val >> 3); - byteshift = gen_rtx_CONST_INT (QImode, byteshift_val); - emit_insn (gen_altivec_vsldoi_<mode> (operands[0], operands[1], operands[1], - byteshift)); - DONE; -}") - (define_insn "altivec_vsumsws_nomode" [(set (match_operand 0 "register_operand" "=v") (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "v") @@ -2182,16 +1892,6 @@ DONE; }") -(define_insn "vec_realign_load_<mode>" - [(set (match_operand:V 0 "register_operand" "=v") - (unspec:V [(match_operand:V 1 "register_operand" "v") - (match_operand:V 2 "register_operand" "v") - (match_operand:V16QI 3 "register_operand" "v")] - UNSPEC_REALIGN_LOAD))] - "TARGET_ALTIVEC" - "vperm %0,%1,%2,%3" - [(set_attr "type" "vecperm")]) - (define_expand "neg<mode>2" [(use (match_operand:VI 0 "register_operand" "")) (use (match_operand:VI 1 "register_operand" ""))] @@ -2643,7 +2343,7 @@ DONE; }") -(define_expand "negv4sf2" +(define_expand "altivec_negv4sf2" [(use (match_operand:V4SF 0 "register_operand" "")) (use (match_operand:V4SF 1 "register_operand" ""))] "TARGET_ALTIVEC" @@ -2972,29 +2672,6 @@ emit_insn (gen_vpkuhum_nomode (operands[0], operands[1], operands[2])); DONE; }") -(define_expand "vec_interleave_highv4sf" - [(set (match_operand:V4SF 0 "register_operand" "") - (unspec:V4SF [(match_operand:V4SF 1 "register_operand" "") - (match_operand:V4SF 2 "register_operand" "")] - UNSPEC_INTERHI_V4SF))] - "TARGET_ALTIVEC" - " -{ - emit_insn (gen_altivec_vmrghsf (operands[0], operands[1], operands[2])); - DONE; -}") - -(define_expand "vec_interleave_lowv4sf" - [(set (match_operand:V4SF 0 "register_operand" "") - (unspec:V4SF [(match_operand:V4SF 1 "register_operand" "") - (match_operand:V4SF 2 "register_operand" "")] - UNSPEC_INTERLO_V4SF))] - "TARGET_ALTIVEC" - " -{ - emit_insn (gen_altivec_vmrglsf (operands[0], operands[1], operands[2])); - DONE; -}") (define_expand 
"vec_interleave_high<mode>" [(set (match_operand:VI 0 "register_operand" "") Index: gcc-4.3.4-20091019/gcc/config/rs6000/constraints.md =================================================================== --- gcc-4.3.4-20091019.orig/gcc/config/rs6000/constraints.md 2008-02-19 10:55:53.000000000 +0100 +++ gcc-4.3.4-20091019/gcc/config/rs6000/constraints.md 2009-10-19 13:40:37.000000000 +0200 @@ -17,10 +17,14 @@ ;; along with GCC; see the file COPYING3. If not see ;; <http://www.gnu.org/licenses/>. +;; Available constraint letters: "e", "k", "u", "A", "B", "C", "D" + ;; Register constraints -(define_register_constraint "f" "TARGET_HARD_FLOAT && TARGET_FPRS - ? FLOAT_REGS : NO_REGS" +(define_register_constraint "f" "rs6000_constraints[RS6000_CONSTRAINT_f]" + "@internal") + +(define_register_constraint "d" "rs6000_constraints[RS6000_CONSTRAINT_d]" "@internal") (define_register_constraint "b" "BASE_REGS" @@ -50,6 +54,28 @@ (define_register_constraint "z" "XER_REGS" "@internal") +;; Use w as a prefix to add VSX modes +;; vector double (V2DF) +(define_register_constraint "wd" "rs6000_constraints[RS6000_CONSTRAINT_wd]" + "@internal") + +;; vector float (V4SF) +(define_register_constraint "wf" "rs6000_constraints[RS6000_CONSTRAINT_wf]" + "@internal") + +;; scalar double (DF) +(define_register_constraint "ws" "rs6000_constraints[RS6000_CONSTRAINT_ws]" + "@internal") + +;; any VSX register +(define_register_constraint "wa" "rs6000_constraints[RS6000_CONSTRAINT_wa]" + "@internal") + +;; Altivec style load/store that ignores the bottom bits of the address +(define_memory_constraint "wZ" + "Indexed or indirect memory operand, ignoring the bottom 4 bits" + (match_operand 0 "altivec_indexed_or_indirect_operand")) + ;; Integer constraints (define_constraint "I" @@ -109,8 +135,17 @@ ;; Memory constraints +(define_memory_constraint "es" + "A ``stable'' memory operand; that is, one which does not include any +automodification of the base register. 
Unlike @samp{m}, this constraint +can be used in @code{asm} statements that might access the operand +several times, or that might not access it at all." + (and (match_code "mem") + (match_test "GET_RTX_CLASS (GET_CODE (XEXP (op, 0))) != RTX_AUTOINC"))) + (define_memory_constraint "Q" - "Memory operand that is just an offset from a reg" + "Memory operand that is an offset from a register (it is usually better +to use @samp{m} or @samp{es} in @code{asm} statements)" (and (match_code "mem") (match_test "GET_CODE (XEXP (op, 0)) == REG"))) @@ -119,7 +154,8 @@ (match_operand 0 "word_offset_memref_operand")) (define_memory_constraint "Z" - "Indexed or indirect memory operand" + "Memory operand that is an indexed or indirect from a register (it is +usually better to use @samp{m} or @samp{es} in @code{asm} statements)" (match_operand 0 "indexed_or_indirect_operand")) ;; Address constraints @@ -159,3 +195,7 @@ (define_constraint "W" "vector constant that does not require memory" (match_operand 0 "easy_vector_constant")) + +(define_constraint "j" + "Zero vector constant" + (match_test "(op == const0_rtx || op == CONST0_RTX (GET_MODE (op)))")) Index: gcc-4.3.4-20091019/gcc/config/rs6000/darwin.h =================================================================== --- gcc-4.3.4-20091019.orig/gcc/config/rs6000/darwin.h 2008-02-28 18:10:07.000000000 +0100 +++ gcc-4.3.4-20091019/gcc/config/rs6000/darwin.h 2009-10-19 13:40:37.000000000 +0200 @@ -1,5 +1,5 @@ /* Target definitions for PowerPC running Darwin (Mac OS X). - Copyright (C) 1997, 2000, 2001, 2003, 2004, 2005, 2006, 2007 + Copyright (C) 1997, 2000, 2001, 2003, 2004, 2005, 2006, 2007, 2008 Free Software Foundation, Inc. Contributed by Apple Computer Inc. @@ -191,6 +191,8 @@ #undef FP_SAVE_INLINE #define FP_SAVE_INLINE(FIRST_REG) ((FIRST_REG) < 64) +#undef GP_SAVE_INLINE +#define GP_SAVE_INLINE(FIRST_REG) ((FIRST_REG) < 32) /* Darwin uses a function call if everything needs to be saved/restored. 
*/ #undef WORLD_SAVE_P Index: gcc-4.3.4-20091019/gcc/config/rs6000/darwin-ldouble.c =================================================================== --- gcc-4.3.4-20091019.orig/gcc/config/rs6000/darwin-ldouble.c 2008-02-19 10:55:52.000000000 +0100 +++ gcc-4.3.4-20091019/gcc/config/rs6000/darwin-ldouble.c 2009-10-19 13:40:37.000000000 +0200 @@ -92,18 +92,39 @@ __asm__ (".symver __gcc_qadd,_xlqadd@GCC ".symver .__gcc_qdiv,._xlqdiv@GCC_3.4"); #endif -typedef union +/* Avoid the horrible code gcc generates when using a union to turn + a pair of doubles into a long double. All PowerPC ABIs using this + file return a long double in fr1/fr2, except of course when using + soft-float. */ +static inline long double +pack (double a, double aa) { - long double ldval; - double dval[2]; -} longDblUnion; +#ifdef __NO_FPRS__ + union + { + long double ldval; + double dval[2]; + } x; + + x.dval[0] = a; + x.dval[1] = aa; + return x.ldval; +#else + register double hi __asm__ ("fr1"); + register double lo __asm__ ("fr2"); + register long double ld __asm__ ("fr1"); + hi = a; + lo = aa; + __asm__ ("" : "=f" (ld) : "f" (hi), "f" (lo)); + return ld; +#endif +} /* Add two 'long double' values and return the result. */ long double __gcc_qadd (double a, double aa, double c, double cc) { - longDblUnion x; - double z, q, zz, xh; + double xh, xl, z, q, zz; z = a + c; @@ -112,12 +133,12 @@ __gcc_qadd (double a, double aa, double z = cc + aa + c + a; if (nonfinite (z)) return z; - x.dval[0] = z; /* Will always be DBL_MAX. */ + xh = z; /* Will always be DBL_MAX. 
*/ zz = aa + cc; if (fabs(a) > fabs(c)) - x.dval[1] = a - z + c + zz; + xl = a - z + c + zz; else - x.dval[1] = c - z + a + zz; + xl = c - z + a + zz; } else { @@ -132,10 +153,9 @@ __gcc_qadd (double a, double aa, double if (nonfinite (xh)) return xh; - x.dval[0] = xh; - x.dval[1] = z - xh + zz; + xl = z - xh + zz; } - return x.ldval; + return pack (xh, xl); } long double @@ -151,8 +171,7 @@ static double fmsub (double, double, dou long double __gcc_qmul (double a, double b, double c, double d) { - longDblUnion z; - double t, tau, u, v, w; + double xh, xl, t, tau, u, v, w; t = a * c; /* Highest order double term. */ @@ -176,16 +195,15 @@ __gcc_qmul (double a, double b, double c /* Construct long double result. */ if (nonfinite (u)) return u; - z.dval[0] = u; - z.dval[1] = (t - u) + tau; - return z.ldval; + xh = u; + xl = (t - u) + tau; + return pack (xh, xl); } long double __gcc_qdiv (double a, double b, double c, double d) { - longDblUnion z; - double s, sigma, t, tau, u, v, w; + double xh, xl, s, sigma, t, tau, u, v, w; t = a / c; /* highest order double term */ @@ -213,9 +231,9 @@ __gcc_qdiv (double a, double b, double c /* Construct long double result. */ if (nonfinite (u)) return u; - z.dval[0] = u; - z.dval[1] = (t - u) + tau; - return z.ldval; + xh = u; + xl = (t - u) + tau; + return pack (xh, xl); } #if defined (_SOFT_DOUBLE) && defined (__LONG_DOUBLE_128__) @@ -242,11 +260,7 @@ extern int __gedf2 (double, double); long double __gcc_qneg (double a, double aa) { - longDblUnion x; - - x.dval[0] = -a; - x.dval[1] = -aa; - return x.ldval; + return pack (-a, -aa); } /* Compare two 'long double' values for equality. */ @@ -286,24 +300,14 @@ strong_alias (__gcc_qge, __gcc_qgt); long double __gcc_stoq (float a) { - longDblUnion x; - - x.dval[0] = (double) a; - x.dval[1] = 0.0; - - return x.ldval; + return pack ((double) a, 0.0); } /* Convert double to long double. 
*/ long double __gcc_dtoq (double a) { - longDblUnion x; - - x.dval[0] = a; - x.dval[1] = 0.0; - - return x.ldval; + return pack (a, 0.0); } /* Convert long double to single. */ Index: gcc-4.3.4-20091019/gcc/config/rs6000/driver-rs6000.c =================================================================== --- gcc-4.3.4-20091019.orig/gcc/config/rs6000/driver-rs6000.c 2008-08-19 17:30:47.000000000 +0200 +++ gcc-4.3.4-20091019/gcc/config/rs6000/driver-rs6000.c 2009-10-19 13:40:37.000000000 +0200 @@ -343,47 +343,156 @@ detect_processor_aix (void) #endif /* _AIX */ +/* + * Array to map -mcpu=native names to the switches passed to the assembler. + * This list mirrors the specs in ASM_CPU_SPEC, and any changes made here + * should be made there as well. + */ + +struct asm_name { + const char *cpu; + const char *asm_sw; +}; + +static const struct asm_name asm_names[] = { +#if defined (_AIX) + { "power3", "-m620" }, + { "power4", "-mpwr4" }, + { "power5", "-mpwr5" }, + { "power5+", "-mpwr5x" }, + { "power6", "-mpwr6" }, + { "power6x", "-mpwr6" }, + { "power7", "-mpwr7" }, + { "powerpc", "-mppc" }, + { "rs64a", "-mppc" }, + { "603", "-m603" }, + { "603e", "-m603" }, + { "604", "-m604" }, + { "604e", "-m604" }, + { "620", "-m620" }, + { "630", "-m620" }, + { "970", "-m970" }, + { "G5", "-m970" }, + { NULL, "\ +%{!maix64: \ +%{mpowerpc64: -mppc64} \ +%{maltivec: -m970} \ +%{!maltivec: %{!mpower64: %(asm_default)}}}" }, + +#else + { "common", "-mcom" }, + { "cell", "-mcell" }, + { "power", "-mpwr" }, + { "power2", "-mpwrx" }, + { "power3", "-mppc64" }, + { "power4", "-mpower4" }, + { "power5", "%(asm_cpu_power5)" }, + { "power5+", "%(asm_cpu_power5)" }, + { "power6", "%(asm_cpu_power6) -maltivec" }, + { "power6x", "%(asm_cpu_power6) -maltivec" }, + { "power7", "%(asm_cpu_power7)" }, + { "powerpc", "-mppc" }, + { "rios", "-mpwr" }, + { "rios1", "-mpwr" }, + { "rios2", "-mpwrx" }, + { "rsc", "-mpwr" }, + { "rsc1", "-mpwr" }, + { "rs64a", "-mppc64" }, + { "401", "-mppc" }, + { 
"403", "-m403" }, + { "405", "-m405" }, + { "405fp", "-m405" }, + { "440", "-m440" }, + { "440fp", "-m440" }, + { "464", "-m440" }, + { "464fp", "-m440" }, + { "505", "-mppc" }, + { "601", "-m601" }, + { "602", "-mppc" }, + { "603", "-mppc" }, + { "603e", "-mppc" }, + { "ec603e", "-mppc" }, + { "604", "-mppc" }, + { "604e", "-mppc" }, + { "620", "-mppc64" }, + { "630", "-mppc64" }, + { "740", "-mppc" }, + { "750", "-mppc" }, + { "G3", "-mppc" }, + { "7400", "-mppc -maltivec" }, + { "7450", "-mppc -maltivec" }, + { "G4", "-mppc -maltivec" }, + { "801", "-mppc" }, + { "821", "-mppc" }, + { "823", "-mppc" }, + { "860", "-mppc" }, + { "970", "-mpower4 -maltivec" }, + { "G5", "-mpower4 -maltivec" }, + { "8540", "-me500" }, + { "8548", "-me500" }, + { "e300c2", "-me300" }, + { "e300c3", "-me300" }, + { "e500mc", "-me500mc" }, + { NULL, "\ +%{mpower: %{!mpower2: -mpwr}} \ +%{mpower2: -mpwrx} \ +%{mpowerpc64*: -mppc64} \ +%{!mpowerpc64*: %{mpowerpc*: -mppc}} \ +%{mno-power: %{!mpowerpc*: -mcom}} \ +%{!mno-power: %{!mpower*: %(asm_default)}}" }, +#endif +}; + /* This will be called by the spec parser in gcc.c when it sees a %:local_cpu_detect(args) construct. Currently it will be called with either "arch" or "tune" as argument depending on if -march=native or -mtune=native is to be substituted. + Additionally it will be called with "asm" to select the appropriate flags + for the assembler. + It returns a string containing new command line parameters to be put at the place of the above two options, depending on what CPU this is executed. ARGC and ARGV are set depending on the actual arguments given in the spec. 
*/ -const char -*host_detect_local_cpu (int argc, const char **argv) +const char * +host_detect_local_cpu (int argc, const char **argv) { const char *cpu = NULL; const char *cache = ""; const char *options = ""; bool arch; + bool assembler; + size_t i; if (argc < 1) return NULL; arch = strcmp (argv[0], "cpu") == 0; - if (!arch && strcmp (argv[0], "tune")) + assembler = (!arch && strcmp (argv[0], "asm") == 0); + if (!arch && !assembler && strcmp (argv[0], "tune")) return NULL; + if (! assembler) + { #if defined (_AIX) - cache = detect_caches_aix (); + cache = detect_caches_aix (); #elif defined (__APPLE__) - cache = detect_caches_darwin (); + cache = detect_caches_darwin (); #elif defined (__FreeBSD__) - cache = detect_caches_freebsd (); - /* FreeBSD PPC does not provide any cache information yet. */ - cache = ""; + cache = detect_caches_freebsd (); + /* FreeBSD PPC does not provide any cache information yet. */ + cache = ""; #elif defined (__linux__) - cache = detect_caches_linux (); - /* PPC Linux does not provide any cache information yet. */ - cache = ""; + cache = detect_caches_linux (); + /* PPC Linux does not provide any cache information yet. */ + cache = ""; #else - cache = ""; + cache = ""; #endif + } #if defined (_AIX) cpu = detect_processor_aix (); @@ -397,6 +506,17 @@ const char cpu = "powerpc"; #endif + if (assembler) + { + for (i = 0; i < sizeof (asm_names) / sizeof (asm_names[0]); i++) + { + if (!asm_names[i].cpu || !strcmp (asm_names[i].cpu, cpu)) + return asm_names[i].asm_sw; + } + + return NULL; + } + return concat (cache, "-m", argv[0], "=", cpu, " ", options, NULL); } @@ -404,7 +524,8 @@ const char /* If we aren't compiling with GCC we just provide a minimal default value. 
*/ -const char *host_detect_local_cpu (int argc, const char **argv) +const char * +host_detect_local_cpu (int argc, const char **argv) { const char *cpu; bool arch; Index: gcc-4.3.4-20091019/gcc/config/rs6000/e300c2c3.md =================================================================== --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ gcc-4.3.4-20091019/gcc/config/rs6000/e300c2c3.md 2009-10-19 13:40:37.000000000 +0200 @@ -0,0 +1,189 @@ +;; Pipeline description for Motorola PowerPC e300c3 core. +;; Copyright (C) 2008 Free Software Foundation, Inc. +;; Contributed by Edmar Wienskoski (edmar@freescale.com) +;; +;; This file is part of GCC. +;; +;; GCC is free software; you can redistribute it and/or modify it +;; under the terms of the GNU General Public License as published +;; by the Free Software Foundation; either version 3, or (at your +;; option) any later version. +;; +;; GCC is distributed in the hope that it will be useful, but WITHOUT +;; ANY WARRANTY; without even the implied warranty of MERCHANTABILITY +;; or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public +;; License for more details. +;; +;; You should have received a copy of the GNU General Public License +;; along with GCC; see the file COPYING3. If not see +;; <http://www.gnu.org/licenses/>. + +(define_automaton "ppce300c3_most,ppce300c3_long,ppce300c3_retire") +(define_cpu_unit "ppce300c3_decode_0,ppce300c3_decode_1" "ppce300c3_most") + +;; We don't simulate general issue queue (GIC). If we have SU insn +;; and then SU1 insn, they can not be issued on the same cycle +;; (although SU1 insn and then SU insn can be issued) because the SU +;; insn will go to SU1 from GIC0 entry. Fortunately, the first cycle +;; multipass insn scheduling will find the situation and issue the SU1 +;; insn and then the SU insn. 
+(define_cpu_unit "ppce300c3_issue_0,ppce300c3_issue_1" "ppce300c3_most") + +;; We could describe completion buffers slots in combination with the +;; retirement units and the order of completion but the result +;; automaton would behave in the same way because we can not describe +;; real latency time with taking in order completion into account. +;; Actually we could define the real latency time by querying reserved +;; automaton units but the current scheduler uses latency time before +;; issuing insns and making any reservations. +;; +;; So our description is aimed to achieve a insn schedule in which the +;; insns would not wait in the completion buffer. +(define_cpu_unit "ppce300c3_retire_0,ppce300c3_retire_1" "ppce300c3_retire") + +;; Branch unit: +(define_cpu_unit "ppce300c3_bu" "ppce300c3_most") + +;; IU: +(define_cpu_unit "ppce300c3_iu0_stage0,ppce300c3_iu1_stage0" "ppce300c3_most") + +;; IU: This used to describe non-pipelined division. +(define_cpu_unit "ppce300c3_mu_div" "ppce300c3_long") + +;; SRU: +(define_cpu_unit "ppce300c3_sru_stage0" "ppce300c3_most") + +;; Here we simplified LSU unit description not describing the stages. +(define_cpu_unit "ppce300c3_lsu" "ppce300c3_most") + +;; FPU: +(define_cpu_unit "ppce300c3_fpu" "ppce300c3_most") + +;; The following units are used to make automata deterministic +(define_cpu_unit "present_ppce300c3_decode_0" "ppce300c3_most") +(define_cpu_unit "present_ppce300c3_issue_0" "ppce300c3_most") +(define_cpu_unit "present_ppce300c3_retire_0" "ppce300c3_retire") +(define_cpu_unit "present_ppce300c3_iu0_stage0" "ppce300c3_most") + +;; The following sets to make automata deterministic when option ndfa is used. +(presence_set "present_ppce300c3_decode_0" "ppce300c3_decode_0") +(presence_set "present_ppce300c3_issue_0" "ppce300c3_issue_0") +(presence_set "present_ppce300c3_retire_0" "ppce300c3_retire_0") +(presence_set "present_ppce300c3_iu0_stage0" "ppce300c3_iu0_stage0") + +;; Some useful abbreviations. 
+(define_reservation "ppce300c3_decode" + "ppce300c3_decode_0|ppce300c3_decode_1+present_ppce300c3_decode_0") +(define_reservation "ppce300c3_issue" + "ppce300c3_issue_0|ppce300c3_issue_1+present_ppce300c3_issue_0") +(define_reservation "ppce300c3_retire" + "ppce300c3_retire_0|ppce300c3_retire_1+present_ppce300c3_retire_0") +(define_reservation "ppce300c3_iu_stage0" + "ppce300c3_iu0_stage0|ppce300c3_iu1_stage0+present_ppce300c3_iu0_stage0") + +;; Compares can be executed either one of the IU or SRU +(define_insn_reservation "ppce300c3_cmp" 1 + (and (eq_attr "type" "cmp,compare,delayed_compare,fast_compare") + (ior (eq_attr "cpu" "ppce300c2") (eq_attr "cpu" "ppce300c3"))) + "ppce300c3_decode,ppce300c3_issue+(ppce300c3_iu_stage0|ppce300c3_sru_stage0) \ + +ppce300c3_retire") + +;; Other one cycle IU insns +(define_insn_reservation "ppce300c3_iu" 1 + (and (eq_attr "type" "integer,insert_word") + (ior (eq_attr "cpu" "ppce300c2") (eq_attr "cpu" "ppce300c3"))) + "ppce300c3_decode,ppce300c3_issue+ppce300c3_iu_stage0+ppce300c3_retire") + +;; Branch. Actually this latency time is not used by the scheduler. +(define_insn_reservation "ppce300c3_branch" 1 + (and (eq_attr "type" "jmpreg,branch") + (ior (eq_attr "cpu" "ppce300c2") (eq_attr "cpu" "ppce300c3"))) + "ppce300c3_decode,ppce300c3_bu,ppce300c3_retire") + +;; Multiply is non-pipelined but can be executed in any IU +(define_insn_reservation "ppce300c3_multiply" 2 + (and (eq_attr "type" "imul,imul2,imul3,imul_compare") + (ior (eq_attr "cpu" "ppce300c2") (eq_attr "cpu" "ppce300c3"))) + "ppce300c3_decode,ppce300c3_issue+ppce300c3_iu_stage0, \ + ppce300c3_iu_stage0+ppce300c3_retire") + +;; Divide. We use the average latency time here. We omit reserving a +;; retire unit because of the result automata will be huge. 
+(define_insn_reservation "ppce300c3_divide" 20 + (and (eq_attr "type" "idiv") + (ior (eq_attr "cpu" "ppce300c2") (eq_attr "cpu" "ppce300c3"))) + "ppce300c3_decode,ppce300c3_issue+ppce300c3_iu_stage0+ppce300c3_mu_div,\ + ppce300c3_mu_div*19") + +;; CR logical +(define_insn_reservation "ppce300c3_cr_logical" 1 + (and (eq_attr "type" "cr_logical,delayed_cr") + (ior (eq_attr "cpu" "ppce300c2") (eq_attr "cpu" "ppce300c3"))) + "ppce300c3_decode,ppce300c3_issue+ppce300c3_sru_stage0+ppce300c3_retire") + +;; Mfcr +(define_insn_reservation "ppce300c3_mfcr" 1 + (and (eq_attr "type" "mfcr") + (ior (eq_attr "cpu" "ppce300c2") (eq_attr "cpu" "ppce300c3"))) + "ppce300c3_decode,ppce300c3_issue+ppce300c3_sru_stage0+ppce300c3_retire") + +;; Mtcrf +(define_insn_reservation "ppce300c3_mtcrf" 1 + (and (eq_attr "type" "mtcr") + (ior (eq_attr "cpu" "ppce300c2") (eq_attr "cpu" "ppce300c3"))) + "ppce300c3_decode,ppce300c3_issue+ppce300c3_sru_stage0+ppce300c3_retire") + +;; Mtjmpr +(define_insn_reservation "ppce300c3_mtjmpr" 1 + (and (eq_attr "type" "mtjmpr,mfjmpr") + (ior (eq_attr "cpu" "ppce300c2") (eq_attr "cpu" "ppce300c3"))) + "ppce300c3_decode,ppce300c3_issue+ppce300c3_sru_stage0+ppce300c3_retire") + +;; Float point instructions +(define_insn_reservation "ppce300c3_fpcompare" 3 + (and (eq_attr "type" "fpcompare") + (eq_attr "cpu" "ppce300c3")) + "ppce300c3_decode,ppce300c3_issue+ppce300c3_fpu,nothing,ppce300c3_retire") + +(define_insn_reservation "ppce300c3_fp" 3 + (and (eq_attr "type" "fp") + (eq_attr "cpu" "ppce300c3")) + "ppce300c3_decode,ppce300c3_issue+ppce300c3_fpu,nothing,ppce300c3_retire") + +(define_insn_reservation "ppce300c3_dmul" 4 + (and (eq_attr "type" "dmul") + (eq_attr "cpu" "ppce300c3")) + "ppce300c3_decode,ppce300c3_issue+ppce300c3_fpu,ppce300c3_fpu,nothing,ppce300c3_retire") + +; Divides are not pipelined +(define_insn_reservation "ppce300c3_sdiv" 18 + (and (eq_attr "type" "sdiv") + (eq_attr "cpu" "ppce300c3")) + 
"ppce300c3_decode,ppce300c3_issue+ppce300c3_fpu,ppce300c3_fpu*17") + +(define_insn_reservation "ppce300c3_ddiv" 33 + (and (eq_attr "type" "ddiv") + (eq_attr "cpu" "ppce300c3")) + "ppce300c3_decode,ppce300c3_issue+ppce300c3_fpu,ppce300c3_fpu*32") + +;; Loads +(define_insn_reservation "ppce300c3_load" 2 + (and (eq_attr "type" "load,load_ext,load_ext_u,load_ext_ux,load_ux,load_u") + (ior (eq_attr "cpu" "ppce300c2") (eq_attr "cpu" "ppce300c3"))) + "ppce300c3_decode,ppce300c3_issue+ppce300c3_lsu,ppce300c3_retire") + +(define_insn_reservation "ppce300c3_fpload" 2 + (and (eq_attr "type" "fpload,fpload_ux,fpload_u") + (eq_attr "cpu" "ppce300c3")) + "ppce300c3_decode,ppce300c3_issue+ppce300c3_lsu,ppce300c3_retire") + +;; Stores. +(define_insn_reservation "ppce300c3_store" 2 + (and (eq_attr "type" "store,store_ux,store_u") + (ior (eq_attr "cpu" "ppce300c2") (eq_attr "cpu" "ppce300c3"))) + "ppce300c3_decode,ppce300c3_issue+ppce300c3_lsu,ppce300c3_retire") + +(define_insn_reservation "ppce300c3_fpstore" 2 + (and (eq_attr "type" "fpstore,fpstore_ux,fpstore_u") + (eq_attr "cpu" "ppce300c3")) + "ppce300c3_decode,ppce300c3_issue+ppce300c3_lsu,ppce300c3_retire") Index: gcc-4.3.4-20091019/gcc/config/rs6000/e500mc.md =================================================================== --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ gcc-4.3.4-20091019/gcc/config/rs6000/e500mc.md 2009-10-19 13:40:37.000000000 +0200 @@ -0,0 +1,200 @@ +;; Pipeline description for Motorola PowerPC e500mc core. +;; Copyright (C) 2008 Free Software Foundation, Inc. +;; Contributed by Edmar Wienskoski (edmar@freescale.com) +;; +;; This file is part of GCC. +;; +;; GCC is free software; you can redistribute it and/or modify it +;; under the terms of the GNU General Public License as published +;; by the Free Software Foundation; either version 3, or (at your +;; option) any later version. 
+;; +;; GCC is distributed in the hope that it will be useful, but WITHOUT +;; ANY WARRANTY; without even the implied warranty of MERCHANTABILITY +;; or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public +;; License for more details. +;; +;; You should have received a copy of the GNU General Public License +;; along with GCC; see the file COPYING3. If not see +;; <http://www.gnu.org/licenses/>. +;; +;; e500mc 32-bit SU(2), LSU, FPU, BPU +;; Max issue 3 insns/clock cycle (includes 1 branch) +;; FP is half clocked, timings of other instructions are as in the e500v2. + +(define_automaton "e500mc_most,e500mc_long,e500mc_retire") +(define_cpu_unit "e500mc_decode_0,e500mc_decode_1" "e500mc_most") +(define_cpu_unit "e500mc_issue_0,e500mc_issue_1" "e500mc_most") +(define_cpu_unit "e500mc_retire_0,e500mc_retire_1" "e500mc_retire") + +;; SU. +(define_cpu_unit "e500mc_su0_stage0,e500mc_su1_stage0" "e500mc_most") + +;; MU. +(define_cpu_unit "e500mc_mu_stage0,e500mc_mu_stage1" "e500mc_most") +(define_cpu_unit "e500mc_mu_stage2,e500mc_mu_stage3" "e500mc_most") + +;; Non-pipelined division. +(define_cpu_unit "e500mc_mu_div" "e500mc_long") + +;; LSU. +(define_cpu_unit "e500mc_lsu" "e500mc_most") + +;; FPU. +(define_cpu_unit "e500mc_fpu" "e500mc_most") + +;; Branch unit. +(define_cpu_unit "e500mc_bu" "e500mc_most") + +;; The following units are used to make the automata deterministic. +(define_cpu_unit "present_e500mc_decode_0" "e500mc_most") +(define_cpu_unit "present_e500mc_issue_0" "e500mc_most") +(define_cpu_unit "present_e500mc_retire_0" "e500mc_retire") +(define_cpu_unit "present_e500mc_su0_stage0" "e500mc_most") + +;; The following sets to make automata deterministic when option ndfa is used. +(presence_set "present_e500mc_decode_0" "e500mc_decode_0") +(presence_set "present_e500mc_issue_0" "e500mc_issue_0") +(presence_set "present_e500mc_retire_0" "e500mc_retire_0") +(presence_set "present_e500mc_su0_stage0" "e500mc_su0_stage0") + +;; Some useful abbreviations. 
+(define_reservation "e500mc_decode" + "e500mc_decode_0|e500mc_decode_1+present_e500mc_decode_0") +(define_reservation "e500mc_issue" + "e500mc_issue_0|e500mc_issue_1+present_e500mc_issue_0") +(define_reservation "e500mc_retire" + "e500mc_retire_0|e500mc_retire_1+present_e500mc_retire_0") +(define_reservation "e500mc_su_stage0" + "e500mc_su0_stage0|e500mc_su1_stage0+present_e500mc_su0_stage0") + +;; Simple SU insns. +(define_insn_reservation "e500mc_su" 1 + (and (eq_attr "type" "integer,insert_word,insert_dword,cmp,compare,\ + delayed_compare,var_delayed_compare,fast_compare,\ + shift,trap,var_shift_rotate,cntlz,exts") + (eq_attr "cpu" "ppce500mc")) + "e500mc_decode,e500mc_issue+e500mc_su_stage0+e500mc_retire") + +(define_insn_reservation "e500mc_two" 1 + (and (eq_attr "type" "two") + (eq_attr "cpu" "ppce500mc")) + "e500mc_decode,e500mc_issue+e500mc_su_stage0+e500mc_retire,\ + e500mc_issue+e500mc_su_stage0+e500mc_retire") + +(define_insn_reservation "e500mc_three" 1 + (and (eq_attr "type" "three") + (eq_attr "cpu" "ppce500mc")) + "e500mc_decode,e500mc_issue+e500mc_su_stage0+e500mc_retire,\ + e500mc_issue+e500mc_su_stage0+e500mc_retire,\ + e500mc_issue+e500mc_su_stage0+e500mc_retire") + +;; Multiply. +(define_insn_reservation "e500mc_multiply" 4 + (and (eq_attr "type" "imul,imul2,imul3,imul_compare") + (eq_attr "cpu" "ppce500mc")) + "e500mc_decode,e500mc_issue+e500mc_mu_stage0,e500mc_mu_stage1,\ + e500mc_mu_stage2,e500mc_mu_stage3+e500mc_retire") + +;; Divide. We use the average latency time here. +(define_insn_reservation "e500mc_divide" 14 + (and (eq_attr "type" "idiv") + (eq_attr "cpu" "ppce500mc")) + "e500mc_decode,e500mc_issue+e500mc_mu_stage0+e500mc_mu_div,\ + e500mc_mu_div*13") + +;; Branch. +(define_insn_reservation "e500mc_branch" 1 + (and (eq_attr "type" "jmpreg,branch,isync") + (eq_attr "cpu" "ppce500mc")) + "e500mc_decode,e500mc_bu,e500mc_retire") + +;; CR logical. 
+(define_insn_reservation "e500mc_cr_logical" 1 + (and (eq_attr "type" "cr_logical,delayed_cr") + (eq_attr "cpu" "ppce500mc")) + "e500mc_decode,e500mc_bu,e500mc_retire") + +;; Mfcr. +(define_insn_reservation "e500mc_mfcr" 1 + (and (eq_attr "type" "mfcr") + (eq_attr "cpu" "ppce500mc")) + "e500mc_decode,e500mc_issue+e500mc_su1_stage0+e500mc_retire") + +;; Mtcrf. +(define_insn_reservation "e500mc_mtcrf" 1 + (and (eq_attr "type" "mtcr") + (eq_attr "cpu" "ppce500mc")) + "e500mc_decode,e500mc_issue+e500mc_su1_stage0+e500mc_retire") + +;; Mtjmpr. +(define_insn_reservation "e500mc_mtjmpr" 1 + (and (eq_attr "type" "mtjmpr,mfjmpr") + (eq_attr "cpu" "ppce500mc")) + "e500mc_decode,e500mc_issue+e500mc_su_stage0+e500mc_retire") + +;; Brinc. +(define_insn_reservation "e500mc_brinc" 1 + (and (eq_attr "type" "brinc") + (eq_attr "cpu" "ppce500mc")) + "e500mc_decode,e500mc_issue+e500mc_su_stage0+e500mc_retire") + +;; Loads. +(define_insn_reservation "e500mc_load" 3 + (and (eq_attr "type" "load,load_ext,load_ext_u,load_ext_ux,load_ux,load_u,\ + load_l,sync") + (eq_attr "cpu" "ppce500mc")) + "e500mc_decode,e500mc_issue+e500mc_lsu,nothing,e500mc_retire") + +(define_insn_reservation "e500mc_fpload" 4 + (and (eq_attr "type" "fpload,fpload_ux,fpload_u") + (eq_attr "cpu" "ppce500mc")) + "e500mc_decode,e500mc_issue+e500mc_lsu,nothing*2,e500mc_retire") + +;; Stores. +(define_insn_reservation "e500mc_store" 3 + (and (eq_attr "type" "store,store_ux,store_u,store_c") + (eq_attr "cpu" "ppce500mc")) + "e500mc_decode,e500mc_issue+e500mc_lsu,nothing,e500mc_retire") + +(define_insn_reservation "e500mc_fpstore" 3 + (and (eq_attr "type" "fpstore,fpstore_ux,fpstore_u") + (eq_attr "cpu" "ppce500mc")) + "e500mc_decode,e500mc_issue+e500mc_lsu,nothing,e500mc_retire") + +;; The following ignores the retire unit to avoid a large automata. + +;; Simple FP. 
+(define_insn_reservation "e500mc_simple_float" 8 + (and (eq_attr "type" "fpsimple") + (eq_attr "cpu" "ppce500mc")) + "e500mc_decode,e500mc_issue+e500mc_fpu") +; "e500mc_decode,e500mc_issue+e500mc_fpu,nothing*6,e500mc_retire") + +;; FP. +(define_insn_reservation "e500mc_float" 8 + (and (eq_attr "type" "fp") + (eq_attr "cpu" "ppce500mc")) + "e500mc_decode,e500mc_issue+e500mc_fpu") +; "e500mc_decode,e500mc_issue+e500mc_fpu,nothing*6,e500mc_retire") + +(define_insn_reservation "e500mc_fpcompare" 8 + (and (eq_attr "type" "fpcompare") + (eq_attr "cpu" "ppce500mc")) + "e500mc_decode,e500mc_issue+e500mc_fpu") + +(define_insn_reservation "e500mc_dmul" 10 + (and (eq_attr "type" "dmul") + (eq_attr "cpu" "ppce500mc")) + "e500mc_decode,e500mc_issue+e500mc_fpu") + +;; FP divides are not pipelined. +(define_insn_reservation "e500mc_sdiv" 36 + (and (eq_attr "type" "sdiv") + (eq_attr "cpu" "ppce500mc")) + "e500mc_decode,e500mc_issue+e500mc_fpu,e500mc_fpu*35") + +(define_insn_reservation "e500mc_ddiv" 66 + (and (eq_attr "type" "ddiv") + (eq_attr "cpu" "ppce500mc")) + "e500mc_decode,e500mc_issue+e500mc_fpu,e500mc_fpu*65") Index: gcc-4.3.4-20091019/gcc/config/rs6000/linux64.h =================================================================== --- gcc-4.3.4-20091019.orig/gcc/config/rs6000/linux64.h 2008-02-19 10:55:53.000000000 +0100 +++ gcc-4.3.4-20091019/gcc/config/rs6000/linux64.h 2009-10-19 13:40:37.000000000 +0200 @@ -114,7 +114,7 @@ extern int dot_symbols; error (INVALID_32BIT, "32"); \ if (TARGET_PROFILE_KERNEL) \ { \ - target_flags &= ~MASK_PROFILE_KERNEL; \ + SET_PROFILE_KERNEL (0); \ error (INVALID_32BIT, "profile-kernel"); \ } \ } \ Index: gcc-4.3.4-20091019/gcc/config/rs6000/linux64.opt =================================================================== --- gcc-4.3.4-20091019.orig/gcc/config/rs6000/linux64.opt 2008-02-19 10:55:53.000000000 +0100 +++ gcc-4.3.4-20091019/gcc/config/rs6000/linux64.opt 2009-10-19 13:40:37.000000000 +0200 @@ -20,5 +20,5 @@ ; 
<http://www.gnu.org/licenses/>. mprofile-kernel -Target Report Mask(PROFILE_KERNEL) +Target Report Var(TARGET_PROFILE_KERNEL) Call mcount for profiling before a function prologue Index: gcc-4.3.4-20091019/gcc/config/rs6000/power4.md =================================================================== --- gcc-4.3.4-20091019.orig/gcc/config/rs6000/power4.md 2008-02-19 10:55:53.000000000 +0100 +++ gcc-4.3.4-20091019/gcc/config/rs6000/power4.md 2009-10-19 13:40:37.000000000 +0200 @@ -41,21 +41,18 @@ |(du4_power4,lsu1_power4)") (define_reservation "lsuq_power4" - "(du1_power4+du2_power4,lsu1_power4+iu2_power4)\ - |(du2_power4+du3_power4,lsu2_power4+iu2_power4)\ - |(du3_power4+du4_power4,lsu2_power4+iu1_power4)") + "((du1_power4+du2_power4,lsu1_power4)\ + |(du2_power4+du3_power4,lsu2_power4)\ + |(du3_power4+du4_power4,lsu2_power4))\ + +(nothing,iu2_power4|nothing,iu1_power4)") (define_reservation "iq_power4" - "(du1_power4,iu1_power4)\ - |(du2_power4,iu2_power4)\ - |(du3_power4,iu2_power4)\ - |(du4_power4,iu1_power4)") + "(du1_power4|du2_power4|du3_power4|du4_power4),\ + (iu1_power4|iu2_power4)") (define_reservation "fpq_power4" - "(du1_power4,fpu1_power4)\ - |(du2_power4,fpu2_power4)\ - |(du3_power4,fpu2_power4)\ - |(du4_power4,fpu1_power4)") + "(du1_power4|du2_power4|du3_power4|du4_power4),\ + (fpu1_power4|fpu2_power4)") (define_reservation "vq_power4" "(du1_power4,vec_power4)\ @@ -86,9 +83,11 @@ (define_insn_reservation "power4-load-ext" 5 (and (eq_attr "type" "load_ext") (eq_attr "cpu" "power4")) - "(du1_power4+du2_power4,lsu1_power4,nothing,nothing,iu2_power4)\ - |(du2_power4+du3_power4,lsu2_power4,nothing,nothing,iu2_power4)\ - |(du3_power4+du4_power4,lsu2_power4,nothing,nothing,iu1_power4)") + "(du1_power4+du2_power4,lsu1_power4\ + |du2_power4+du3_power4,lsu2_power4\ + |du3_power4+du4_power4,lsu2_power4),\ + nothing,nothing,\ + (iu2_power4|iu1_power4)") (define_insn_reservation "power4-load-ext-update" 5 (and (eq_attr "type" "load_ext_u") @@ -131,18 +130,23 @@ 
(define_insn_reservation "power4-store" 12 (and (eq_attr "type" "store") (eq_attr "cpu" "power4")) - "(du1_power4,lsu1_power4,iu1_power4)\ - |(du2_power4,lsu2_power4,iu2_power4)\ - |(du3_power4,lsu2_power4,iu2_power4)\ - |(du4_power4,lsu1_power4,iu1_power4)") + "((du1_power4,lsu1_power4)\ + |(du2_power4,lsu2_power4)\ + |(du3_power4,lsu2_power4)\ + |(du4_power4,lsu1_power4)),\ + (iu1_power4|iu2_power4)") (define_insn_reservation "power4-store-update" 12 (and (eq_attr "type" "store_u") (eq_attr "cpu" "power4")) - "(du1_power4+du2_power4,lsu1_power4+iu2_power4,iu1_power4)\ - |(du2_power4+du3_power4,lsu2_power4+iu2_power4,iu2_power4)\ - |(du3_power4+du4_power4,lsu2_power4+iu1_power4,iu2_power4)\ - |(du3_power4+du4_power4,lsu2_power4,iu1_power4,iu2_power4)") + "((du1_power4+du2_power4,lsu1_power4)\ + |(du2_power4+du3_power4,lsu2_power4)\ + |(du3_power4+du4_power4,lsu2_power4)\ + |(du3_power4+du4_power4,lsu2_power4))+\ + ((nothing,iu2_power4,iu1_power4)\ + |(nothing,iu2_power4,iu2_power4)\ + |(nothing,iu1_power4,iu2_power4)\ + |(nothing,iu1_power4,iu2_power4))") (define_insn_reservation "power4-store-update-indexed" 12 (and (eq_attr "type" "store_ux") @@ -153,17 +157,19 @@ (define_insn_reservation "power4-fpstore" 12 (and (eq_attr "type" "fpstore") (eq_attr "cpu" "power4")) - "(du1_power4,lsu1_power4,fpu1_power4)\ - |(du2_power4,lsu2_power4,fpu2_power4)\ - |(du3_power4,lsu2_power4,fpu2_power4)\ - |(du4_power4,lsu1_power4,fpu1_power4)") + "((du1_power4,lsu1_power4)\ + |(du2_power4,lsu2_power4)\ + |(du3_power4,lsu2_power4)\ + |(du4_power4,lsu1_power4)),\ + (fpu1_power4|fpu2_power4)") (define_insn_reservation "power4-fpstore-update" 12 (and (eq_attr "type" "fpstore_u,fpstore_ux") (eq_attr "cpu" "power4")) - "(du1_power4+du2_power4,lsu1_power4+iu2_power4,fpu1_power4)\ - |(du2_power4+du3_power4,lsu2_power4+iu2_power4,fpu2_power4)\ - |(du3_power4+du4_power4,lsu2_power4+iu1_power4,fpu2_power4)") + "((du1_power4+du2_power4,lsu1_power4)\ + |(du2_power4+du3_power4,lsu2_power4)\ + 
|(du3_power4+du4_power4,lsu2_power4))\ + +(nothing,(iu1_power4|iu2_power4),(fpu1_power4|fpu2_power4))") (define_insn_reservation "power4-vecstore" 12 (and (eq_attr "type" "vecstore") @@ -176,8 +182,7 @@ (define_insn_reservation "power4-llsc" 11 (and (eq_attr "type" "load_l,store_c,sync") (eq_attr "cpu" "power4")) - "du1_power4+du2_power4+du3_power4+du4_power4,\ - lsu1_power4") + "du1_power4+du2_power4+du3_power4+du4_power4,lsu1_power4") ; Integer latency is 2 cycles @@ -190,29 +195,32 @@ (define_insn_reservation "power4-two" 2 (and (eq_attr "type" "two") (eq_attr "cpu" "power4")) - "(du1_power4+du2_power4,iu1_power4,nothing,iu2_power4)\ - |(du2_power4+du3_power4,iu2_power4,nothing,iu2_power4)\ - |(du3_power4+du4_power4,iu2_power4,nothing,iu1_power4)\ - |(du4_power4+du1_power4,iu1_power4,nothing,iu1_power4)") + "((du1_power4+du2_power4)\ + |(du2_power4+du3_power4)\ + |(du3_power4+du4_power4)\ + |(du4_power4+du1_power4)),\ + ((iu1_power4,nothing,iu2_power4)\ + |(iu2_power4,nothing,iu2_power4)\ + |(iu2_power4,nothing,iu1_power4)\ + |(iu1_power4,nothing,iu1_power4))") (define_insn_reservation "power4-three" 2 (and (eq_attr "type" "three") (eq_attr "cpu" "power4")) - "(du1_power4+du2_power4+du3_power4,\ - iu1_power4,nothing,iu2_power4,nothing,iu2_power4)\ - |(du2_power4+du3_power4+du4_power4,\ - iu2_power4,nothing,iu2_power4,nothing,iu1_power4)\ - |(du3_power4+du4_power4+du1_power4,\ - iu2_power4,nothing,iu1_power4,nothing,iu1_power4)\ - |(du4_power4+du1_power4+du2_power4,\ - iu1_power4,nothing,iu2_power4,nothing,iu2_power4)") + "(du1_power4+du2_power4+du3_power4|du2_power4+du3_power4+du4_power4\ + |du3_power4+du4_power4+du1_power4|du4_power4+du1_power4+du2_power4),\ + ((iu1_power4,nothing,iu2_power4,nothing,iu2_power4)\ + |(iu2_power4,nothing,iu2_power4,nothing,iu1_power4)\ + |(iu2_power4,nothing,iu1_power4,nothing,iu1_power4)\ + |(iu1_power4,nothing,iu2_power4,nothing,iu2_power4))") (define_insn_reservation "power4-insert" 4 (and (eq_attr "type" "insert_word") 
(eq_attr "cpu" "power4")) - "(du1_power4+du2_power4,iu1_power4,nothing,iu2_power4)\ - |(du2_power4+du3_power4,iu2_power4,nothing,iu2_power4)\ - |(du3_power4+du4_power4,iu2_power4,nothing,iu1_power4)") + "(du1_power4+du2_power4|du2_power4+du3_power4|du3_power4+du4_power4),\ + ((iu1_power4,nothing,iu2_power4)\ + |(iu2_power4,nothing,iu2_power4)\ + |(iu2_power4,nothing,iu1_power4))") (define_insn_reservation "power4-cmp" 3 (and (eq_attr "type" "cmp,fast_compare") @@ -222,53 +230,50 @@ (define_insn_reservation "power4-compare" 2 (and (eq_attr "type" "compare,delayed_compare,var_delayed_compare") (eq_attr "cpu" "power4")) - "(du1_power4+du2_power4,iu1_power4,iu2_power4)\ - |(du2_power4+du3_power4,iu2_power4,iu2_power4)\ - |(du3_power4+du4_power4,iu2_power4,iu1_power4)") + "(du1_power4+du2_power4|du2_power4+du3_power4|du3_power4+du4_power4),\ + ((iu1_power4,iu2_power4)\ + |(iu2_power4,iu2_power4)\ + |(iu2_power4,iu1_power4))") (define_bypass 4 "power4-compare" "power4-branch,power4-crlogical,power4-delayedcr,power4-mfcr,power4-mfcrf") (define_insn_reservation "power4-lmul-cmp" 7 (and (eq_attr "type" "lmul_compare") (eq_attr "cpu" "power4")) - "(du1_power4+du2_power4,iu1_power4*6,iu2_power4)\ - |(du2_power4+du3_power4,iu2_power4*6,iu2_power4)\ - |(du3_power4+du4_power4,iu2_power4*6,iu1_power4)") + "(du1_power4+du2_power4|du2_power4+du3_power4|du3_power4+du4_power4),\ + ((iu1_power4*6,iu2_power4)\ + |(iu2_power4*6,iu2_power4)\ + |(iu2_power4*6,iu1_power4))") (define_bypass 10 "power4-lmul-cmp" "power4-branch,power4-crlogical,power4-delayedcr,power4-mfcr,power4-mfcrf") (define_insn_reservation "power4-imul-cmp" 5 (and (eq_attr "type" "imul_compare") (eq_attr "cpu" "power4")) - "(du1_power4+du2_power4,iu1_power4*4,iu2_power4)\ - |(du2_power4+du3_power4,iu2_power4*4,iu2_power4)\ - |(du3_power4+du4_power4,iu2_power4*4,iu1_power4)") + "(du1_power4+du2_power4|du2_power4+du3_power4|du3_power4+du4_power4),\ + ((iu1_power4*4,iu2_power4)\ + |(iu2_power4*4,iu2_power4)\ + 
|(iu2_power4*4,iu1_power4))") (define_bypass 8 "power4-imul-cmp" "power4-branch,power4-crlogical,power4-delayedcr,power4-mfcr,power4-mfcrf") (define_insn_reservation "power4-lmul" 7 (and (eq_attr "type" "lmul") (eq_attr "cpu" "power4")) - "(du1_power4,iu1_power4*6)\ - |(du2_power4,iu2_power4*6)\ - |(du3_power4,iu2_power4*6)\ - |(du4_power4,iu1_power4*6)") + "(du1_power4|du2_power4|du3_power4|du4_power4),\ + (iu1_power4*6|iu2_power4*6)") (define_insn_reservation "power4-imul" 5 (and (eq_attr "type" "imul") (eq_attr "cpu" "power4")) - "(du1_power4,iu1_power4*4)\ - |(du2_power4,iu2_power4*4)\ - |(du3_power4,iu2_power4*4)\ - |(du4_power4,iu1_power4*4)") + "(du1_power4|du2_power4|du3_power4|du4_power4),\ + (iu1_power4*4|iu2_power4*4)") (define_insn_reservation "power4-imul3" 4 (and (eq_attr "type" "imul2,imul3") (eq_attr "cpu" "power4")) - "(du1_power4,iu1_power4*3)\ - |(du2_power4,iu2_power4*3)\ - |(du3_power4,iu2_power4*3)\ - |(du4_power4,iu1_power4*3)") + "(du1_power4|du2_power4|du3_power4|du4_power4),\ + (iu1_power4*3|iu2_power4*3)") ; SPR move only executes in first IU. 
@@ -347,24 +352,19 @@ (define_insn_reservation "power4-sdiv" 33 (and (eq_attr "type" "sdiv,ddiv") (eq_attr "cpu" "power4")) - "(du1_power4,fpu1_power4*28)\ - |(du2_power4,fpu2_power4*28)\ - |(du3_power4,fpu2_power4*28)\ - |(du4_power4,fpu1_power4*28)") + "(du1_power4|du2_power4|du3_power4|du4_power4),\ + (fpu1_power4*28|fpu2_power4*28)") (define_insn_reservation "power4-sqrt" 40 (and (eq_attr "type" "ssqrt,dsqrt") (eq_attr "cpu" "power4")) - "(du1_power4,fpu1_power4*35)\ - |(du2_power4,fpu2_power4*35)\ - |(du3_power4,fpu2_power4*35)\ - |(du4_power4,fpu2_power4*35)") + "(du1_power4|du2_power4|du3_power4|du4_power4),\ + (fpu1_power4*35|fpu2_power4*35)") (define_insn_reservation "power4-isync" 2 (and (eq_attr "type" "isync") (eq_attr "cpu" "power4")) - "du1_power4+du2_power4+du3_power4+du4_power4,\ - lsu1_power4") + "du1_power4+du2_power4+du3_power4+du4_power4,lsu1_power4") ; VMX Index: gcc-4.3.4-20091019/gcc/config/rs6000/power5.md =================================================================== --- gcc-4.3.4-20091019.orig/gcc/config/rs6000/power5.md 2008-02-19 10:55:53.000000000 +0100 +++ gcc-4.3.4-20091019/gcc/config/rs6000/power5.md 2009-10-19 13:40:37.000000000 +0200 @@ -40,16 +40,12 @@ |(du4_power5,lsu1_power5)") (define_reservation "iq_power5" - "(du1_power5,iu1_power5)\ - |(du2_power5,iu2_power5)\ - |(du3_power5,iu2_power5)\ - |(du4_power5,iu1_power5)") + "(du1_power5|du2_power5|du3_power5|du4_power5),\ + (iu1_power5|iu2_power5)") (define_reservation "fpq_power5" - "(du1_power5,fpu1_power5)\ - |(du2_power5,fpu2_power5)\ - |(du3_power5,fpu2_power5)\ - |(du4_power5,fpu1_power5)") + "(du1_power5|du2_power5|du3_power5|du4_power5),\ + (fpu1_power5|fpu2_power5)") ; Dispatch slots are allocated in order conforming to program order. 
(absence_set "du1_power5" "du2_power5,du3_power5,du4_power5,du5_power5") @@ -105,10 +101,11 @@ (define_insn_reservation "power5-store" 12 (and (eq_attr "type" "store") (eq_attr "cpu" "power5")) - "(du1_power5,lsu1_power5,iu1_power5)\ - |(du2_power5,lsu2_power5,iu2_power5)\ - |(du3_power5,lsu2_power5,iu2_power5)\ - |(du4_power5,lsu1_power5,iu1_power5)") + "((du1_power5,lsu1_power5)\ + |(du2_power5,lsu2_power5)\ + |(du3_power5,lsu2_power5)\ + |(du4_power5,lsu1_power5)),\ + (iu1_power5|iu2_power5)") (define_insn_reservation "power5-store-update" 12 (and (eq_attr "type" "store_u") @@ -124,10 +121,11 @@ (define_insn_reservation "power5-fpstore" 12 (and (eq_attr "type" "fpstore") (eq_attr "cpu" "power5")) - "(du1_power5,lsu1_power5,fpu1_power5)\ - |(du2_power5,lsu2_power5,fpu2_power5)\ - |(du3_power5,lsu2_power5,fpu2_power5)\ - |(du4_power5,lsu1_power5,fpu1_power5)") + "((du1_power5,lsu1_power5)\ + |(du2_power5,lsu2_power5)\ + |(du3_power5,lsu2_power5)\ + |(du4_power5,lsu1_power5)),\ + (fpu1_power5|fpu2_power5)") (define_insn_reservation "power5-fpstore-update" 12 (and (eq_attr "type" "fpstore_u,fpstore_ux") @@ -151,22 +149,24 @@ (define_insn_reservation "power5-two" 2 (and (eq_attr "type" "two") (eq_attr "cpu" "power5")) - "(du1_power5+du2_power5,iu1_power5,nothing,iu2_power5)\ - |(du2_power5+du3_power5,iu2_power5,nothing,iu2_power5)\ - |(du3_power5+du4_power5,iu2_power5,nothing,iu1_power5)\ - |(du4_power5+du1_power5,iu1_power5,nothing,iu1_power5)") + "((du1_power5+du2_power5)\ + |(du2_power5+du3_power5)\ + |(du3_power5+du4_power5)\ + |(du4_power5+du1_power5)),\ + ((iu1_power5,nothing,iu2_power5)\ + |(iu2_power5,nothing,iu2_power5)\ + |(iu2_power5,nothing,iu1_power5)\ + |(iu1_power5,nothing,iu1_power5))") (define_insn_reservation "power5-three" 2 (and (eq_attr "type" "three") (eq_attr "cpu" "power5")) - "(du1_power5+du2_power5+du3_power5,\ - iu1_power5,nothing,iu2_power5,nothing,iu2_power5)\ - |(du2_power5+du3_power5+du4_power5,\ - 
iu2_power5,nothing,iu2_power5,nothing,iu1_power5)\ - |(du3_power5+du4_power5+du1_power5,\ - iu2_power5,nothing,iu1_power5,nothing,iu1_power5)\ - |(du4_power5+du1_power5+du2_power5,\ - iu1_power5,nothing,iu2_power5,nothing,iu2_power5)") + "(du1_power5+du2_power5+du3_power5|du2_power5+du3_power5+du4_power5\ + |du3_power5+du4_power5+du1_power5|du4_power5+du1_power5+du2_power5),\ + ((iu1_power5,nothing,iu2_power5,nothing,iu2_power5)\ + |(iu2_power5,nothing,iu2_power5,nothing,iu1_power5)\ + |(iu2_power5,nothing,iu1_power5,nothing,iu1_power5)\ + |(iu1_power5,nothing,iu2_power5,nothing,iu2_power5))") (define_insn_reservation "power5-insert" 4 (and (eq_attr "type" "insert_word") @@ -202,26 +202,17 @@ (define_insn_reservation "power5-lmul" 7 (and (eq_attr "type" "lmul") (eq_attr "cpu" "power5")) - "(du1_power5,iu1_power5*6)\ - |(du2_power5,iu2_power5*6)\ - |(du3_power5,iu2_power5*6)\ - |(du4_power5,iu1_power5*6)") + "(du1_power5|du2_power5|du3_power5|du4_power5),(iu1_power5*6|iu2_power5*6)") (define_insn_reservation "power5-imul" 5 (and (eq_attr "type" "imul") (eq_attr "cpu" "power5")) - "(du1_power5,iu1_power5*4)\ - |(du2_power5,iu2_power5*4)\ - |(du3_power5,iu2_power5*4)\ - |(du4_power5,iu1_power5*4)") + "(du1_power5|du2_power5|du3_power5|du4_power5),(iu1_power5*4|iu2_power5*4)") (define_insn_reservation "power5-imul3" 4 (and (eq_attr "type" "imul2,imul3") (eq_attr "cpu" "power5")) - "(du1_power5,iu1_power5*3)\ - |(du2_power5,iu2_power5*3)\ - |(du3_power5,iu2_power5*3)\ - |(du4_power5,iu1_power5*3)") + "(du1_power5|du2_power5|du3_power5|du4_power5),(iu1_power5*3|iu2_power5*3)") ; SPR move only executes in first IU. 
@@ -300,18 +291,14 @@ (define_insn_reservation "power5-sdiv" 33 (and (eq_attr "type" "sdiv,ddiv") (eq_attr "cpu" "power5")) - "(du1_power5,fpu1_power5*28)\ - |(du2_power5,fpu2_power5*28)\ - |(du3_power5,fpu2_power5*28)\ - |(du4_power5,fpu1_power5*28)") + "(du1_power5|du2_power5|du3_power5|du4_power5),\ + (fpu1_power5*28|fpu2_power5*28)") (define_insn_reservation "power5-sqrt" 40 (and (eq_attr "type" "ssqrt,dsqrt") (eq_attr "cpu" "power5")) - "(du1_power5,fpu1_power5*35)\ - |(du2_power5,fpu2_power5*35)\ - |(du3_power5,fpu2_power5*35)\ - |(du4_power5,fpu2_power5*35)") + "(du1_power5|du2_power5|du3_power5|du4_power5),\ + (fpu1_power5*35|fpu2_power5*35)") (define_insn_reservation "power5-isync" 2 (and (eq_attr "type" "isync") Index: gcc-4.3.4-20091019/gcc/config/rs6000/power7.md =================================================================== --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ gcc-4.3.4-20091019/gcc/config/rs6000/power7.md 2009-10-19 13:40:37.000000000 +0200 @@ -0,0 +1,318 @@ +;; Scheduling description for IBM POWER7 processor. +;; Copyright (C) 2009 Free Software Foundation, Inc. +;; +;; Contributed by Pat Haugen (pthaugen@us.ibm.com). + +;; This file is part of GCC. +;; +;; GCC is free software; you can redistribute it and/or modify it +;; under the terms of the GNU General Public License as published +;; by the Free Software Foundation; either version 3, or (at your +;; option) any later version. +;; +;; GCC is distributed in the hope that it will be useful, but WITHOUT +;; ANY WARRANTY; without even the implied warranty of MERCHANTABILITY +;; or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public +;; License for more details. +;; +;; You should have received a copy of the GNU General Public License +;; along with GCC; see the file COPYING3. If not see +;; <http://www.gnu.org/licenses/>. 
+ +(define_automaton "power7iu,power7lsu,power7vsu,power7misc") + +(define_cpu_unit "iu1_power7,iu2_power7" "power7iu") +(define_cpu_unit "lsu1_power7,lsu2_power7" "power7lsu") +(define_cpu_unit "vsu1_power7,vsu2_power7" "power7vsu") +(define_cpu_unit "bpu_power7,cru_power7" "power7misc") +(define_cpu_unit "du1_power7,du2_power7,du3_power7,du4_power7,du5_power7" + "power7misc") + + +(define_reservation "DU_power7" + "du1_power7|du2_power7|du3_power7|du4_power7") + +(define_reservation "DU2F_power7" + "du1_power7+du2_power7") + +(define_reservation "DU4_power7" + "du1_power7+du2_power7+du3_power7+du4_power7") + +(define_reservation "FXU_power7" + "iu1_power7|iu2_power7") + +(define_reservation "VSU_power7" + "vsu1_power7|vsu2_power7") + +(define_reservation "LSU_power7" + "lsu1_power7|lsu2_power7") + + +; Dispatch slots are allocated in order conforming to program order. +(absence_set "du1_power7" "du2_power7,du3_power7,du4_power7,du5_power7") +(absence_set "du2_power7" "du3_power7,du4_power7,du5_power7") +(absence_set "du3_power7" "du4_power7,du5_power7") +(absence_set "du4_power7" "du5_power7") + + +; LS Unit +(define_insn_reservation "power7-load" 2 + (and (eq_attr "type" "load") + (eq_attr "cpu" "power7")) + "DU_power7,LSU_power7") + +(define_insn_reservation "power7-load-ext" 3 + (and (eq_attr "type" "load_ext") + (eq_attr "cpu" "power7")) + "DU2F_power7,LSU_power7,FXU_power7") + +(define_insn_reservation "power7-load-update" 2 + (and (eq_attr "type" "load_u") + (eq_attr "cpu" "power7")) + "DU2F_power7,LSU_power7+FXU_power7") + +(define_insn_reservation "power7-load-update-indexed" 3 + (and (eq_attr "type" "load_ux") + (eq_attr "cpu" "power7")) + "DU4_power7,FXU_power7,LSU_power7+FXU_power7") + +(define_insn_reservation "power7-load-ext-update" 4 + (and (eq_attr "type" "load_ext_u") + (eq_attr "cpu" "power7")) + "DU2F_power7,LSU_power7+FXU_power7,FXU_power7") + +(define_insn_reservation "power7-load-ext-update-indexed" 4 + (and (eq_attr "type" "load_ext_ux") + 
(eq_attr "cpu" "power7")) + "DU4_power7,FXU_power7,LSU_power7+FXU_power7,FXU_power7") + +(define_insn_reservation "power7-fpload" 3 + (and (eq_attr "type" "fpload") + (eq_attr "cpu" "power7")) + "DU_power7,LSU_power7") + +(define_insn_reservation "power7-fpload-update" 3 + (and (eq_attr "type" "fpload_u,fpload_ux") + (eq_attr "cpu" "power7")) + "DU2F_power7,LSU_power7+FXU_power7") + +(define_insn_reservation "power7-store" 6 ; store-forwarding latency + (and (eq_attr "type" "store") + (eq_attr "cpu" "power7")) + "DU_power7,LSU_power7+FXU_power7") + +(define_insn_reservation "power7-store-update" 6 + (and (eq_attr "type" "store_u") + (eq_attr "cpu" "power7")) + "DU2F_power7,LSU_power7+FXU_power7,FXU_power7") + +(define_insn_reservation "power7-store-update-indexed" 6 + (and (eq_attr "type" "store_ux") + (eq_attr "cpu" "power7")) + "DU4_power7,LSU_power7+FXU_power7,FXU_power7") + +(define_insn_reservation "power7-fpstore" 6 + (and (eq_attr "type" "fpstore") + (eq_attr "cpu" "power7")) + "DU_power7,LSU_power7+VSU_power7") + +(define_insn_reservation "power7-fpstore-update" 6 + (and (eq_attr "type" "fpstore_u,fpstore_ux") + (eq_attr "cpu" "power7")) + "DU_power7,LSU_power7+VSU_power7+FXU_power7") + +(define_insn_reservation "power7-larx" 3 + (and (eq_attr "type" "load_l") + (eq_attr "cpu" "power7")) + "DU4_power7,LSU_power7") + +(define_insn_reservation "power7-stcx" 10 + (and (eq_attr "type" "store_c") + (eq_attr "cpu" "power7")) + "DU4_power7,LSU_power7") + +(define_insn_reservation "power7-vecload" 3 + (and (eq_attr "type" "vecload") + (eq_attr "cpu" "power7")) + "DU_power7,LSU_power7") + +(define_insn_reservation "power7-vecstore" 6 + (and (eq_attr "type" "vecstore") + (eq_attr "cpu" "power7")) + "DU_power7,LSU_power7+VSU_power7") + +(define_insn_reservation "power7-sync" 11 + (and (eq_attr "type" "sync") + (eq_attr "cpu" "power7")) + "DU4_power7,LSU_power7") + + +; FX Unit +(define_insn_reservation "power7-integer" 1 + (and (eq_attr "type" 
"integer,insert_word,insert_dword,shift,trap,\ + var_shift_rotate,exts") + (eq_attr "cpu" "power7")) + "DU_power7,FXU_power7") + +(define_insn_reservation "power7-cntlz" 2 + (and (eq_attr "type" "cntlz") + (eq_attr "cpu" "power7")) + "DU_power7,FXU_power7") + +(define_insn_reservation "power7-two" 2 + (and (eq_attr "type" "two") + (eq_attr "cpu" "power7")) + "DU_power7+DU_power7,FXU_power7,FXU_power7") + +(define_insn_reservation "power7-three" 3 + (and (eq_attr "type" "three") + (eq_attr "cpu" "power7")) + "DU_power7+DU_power7+DU_power7,FXU_power7,FXU_power7,FXU_power7") + +(define_insn_reservation "power7-cmp" 1 + (and (eq_attr "type" "cmp,fast_compare") + (eq_attr "cpu" "power7")) + "DU_power7,FXU_power7") + +(define_insn_reservation "power7-compare" 2 + (and (eq_attr "type" "compare,delayed_compare,var_delayed_compare") + (eq_attr "cpu" "power7")) + "DU2F_power7,FXU_power7,FXU_power7") + +(define_bypass 3 "power7-cmp,power7-compare" "power7-crlogical,power7-delayedcr") + +(define_insn_reservation "power7-mul" 4 + (and (eq_attr "type" "imul,imul2,imul3,lmul") + (eq_attr "cpu" "power7")) + "DU_power7,FXU_power7") + +(define_insn_reservation "power7-mul-compare" 5 + (and (eq_attr "type" "imul_compare,lmul_compare") + (eq_attr "cpu" "power7")) + "DU2F_power7,FXU_power7,nothing*3,FXU_power7") + +(define_insn_reservation "power7-idiv" 36 + (and (eq_attr "type" "idiv") + (eq_attr "cpu" "power7")) + "DU2F_power7,iu1_power7*36|iu2_power7*36") + +(define_insn_reservation "power7-ldiv" 68 + (and (eq_attr "type" "ldiv") + (eq_attr "cpu" "power7")) + "DU2F_power7,iu1_power7*68|iu2_power7*68") + +(define_insn_reservation "power7-isync" 1 ; + (and (eq_attr "type" "isync") + (eq_attr "cpu" "power7")) + "DU4_power7,FXU_power7") + + +; CR Unit +(define_insn_reservation "power7-mtjmpr" 4 + (and (eq_attr "type" "mtjmpr") + (eq_attr "cpu" "power7")) + "du1_power7,FXU_power7") + +(define_insn_reservation "power7-mfjmpr" 5 + (and (eq_attr "type" "mfjmpr") + (eq_attr "cpu" "power7")) 
+ "du1_power7,cru_power7+FXU_power7") + +(define_insn_reservation "power7-crlogical" 3 + (and (eq_attr "type" "cr_logical") + (eq_attr "cpu" "power7")) + "du1_power7,cru_power7") + +(define_insn_reservation "power7-delayedcr" 3 + (and (eq_attr "type" "delayed_cr") + (eq_attr "cpu" "power7")) + "du1_power7,cru_power7") + +(define_insn_reservation "power7-mfcr" 6 + (and (eq_attr "type" "mfcr") + (eq_attr "cpu" "power7")) + "du1_power7,cru_power7") + +(define_insn_reservation "power7-mfcrf" 3 + (and (eq_attr "type" "mfcrf") + (eq_attr "cpu" "power7")) + "du1_power7,cru_power7") + +(define_insn_reservation "power7-mtcr" 3 + (and (eq_attr "type" "mtcr") + (eq_attr "cpu" "power7")) + "DU4_power7,cru_power7+FXU_power7") + + +; BR Unit +; Branches take dispatch Slot 4. The presence_sets prevent other insn from +; grabbing previous dispatch slots once this is assigned. +(define_insn_reservation "power7-branch" 3 + (and (eq_attr "type" "jmpreg,branch") + (eq_attr "cpu" "power7")) + "(du5_power7\ + |du4_power7+du5_power7\ + |du3_power7+du4_power7+du5_power7\ + |du2_power7+du3_power7+du4_power7+du5_power7\ + |du1_power7+du2_power7+du3_power7+du4_power7+du5_power7),bpu_power7") + + +; VS Unit (includes FP/VSX/VMX/DFP) +(define_insn_reservation "power7-fp" 6 + (and (eq_attr "type" "fp,dmul") + (eq_attr "cpu" "power7")) + "DU_power7,VSU_power7") + +(define_bypass 8 "power7-fp" "power7-branch") + +(define_insn_reservation "power7-fpcompare" 4 + (and (eq_attr "type" "fpcompare") + (eq_attr "cpu" "power7")) + "DU_power7,VSU_power7") + +(define_insn_reservation "power7-sdiv" 26 + (and (eq_attr "type" "sdiv") + (eq_attr "cpu" "power7")) + "DU_power7,VSU_power7") + +(define_insn_reservation "power7-ddiv" 32 + (and (eq_attr "type" "ddiv") + (eq_attr "cpu" "power7")) + "DU_power7,VSU_power7") + +(define_insn_reservation "power7-sqrt" 31 + (and (eq_attr "type" "ssqrt") + (eq_attr "cpu" "power7")) + "DU_power7,VSU_power7") + +(define_insn_reservation "power7-dsqrt" 43 + (and (eq_attr 
"type" "dsqrt") + (eq_attr "cpu" "power7")) + "DU_power7,VSU_power7") + +(define_insn_reservation "power7-vecsimple" 2 + (and (eq_attr "type" "vecsimple") + (eq_attr "cpu" "power7")) + "du1_power7,VSU_power7") + +(define_insn_reservation "power7-veccmp" 7 + (and (eq_attr "type" "veccmp") + (eq_attr "cpu" "power7")) + "du1_power7,VSU_power7") + +(define_insn_reservation "power7-vecfloat" 7 + (and (eq_attr "type" "vecfloat") + (eq_attr "cpu" "power7")) + "du1_power7,VSU_power7") + +(define_bypass 6 "power7-vecfloat" "power7-vecfloat") + +(define_insn_reservation "power7-veccomplex" 7 + (and (eq_attr "type" "veccomplex") + (eq_attr "cpu" "power7")) + "du1_power7,VSU_power7") + +(define_insn_reservation "power7-vecperm" 3 + (and (eq_attr "type" "vecperm") + (eq_attr "cpu" "power7")) + "du2_power7,VSU_power7") Index: gcc-4.3.4-20091019/gcc/config/rs6000/ppc-asm.h =================================================================== --- gcc-4.3.4-20091019.orig/gcc/config/rs6000/ppc-asm.h 2008-02-19 10:55:53.000000000 +0100 +++ gcc-4.3.4-20091019/gcc/config/rs6000/ppc-asm.h 2009-10-19 13:40:37.000000000 +0200 @@ -1,4 +1,28 @@ -/* PowerPC asm definitions for GNU C. */ +/* PowerPC asm definitions for GNU C. + +Copyright (C) 2002, 2003, 2008, 2009 Free Software Foundation, Inc. + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify it under +the terms of the GNU General Public License as published by the Free +Software Foundation; either version 3, or (at your option) any later +version. + +GCC is distributed in the hope that it will be useful, but WITHOUT ANY +WARRANTY; without even the implied warranty of MERCHANTABILITY or +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License +for more details. + +Under Section 7 of GPL version 3, you are granted additional +permissions described in the GCC Runtime Library Exception, version +3.1, as published by the Free Software Foundation. 
+ +You should have received a copy of the GNU General Public License and +a copy of the GCC Runtime Library Exception along with this program; +see the files COPYING3 and COPYING.RUNTIME respectively. If not, see +<http://www.gnu.org/licenses/>. */ + /* Under winnt, 1) gas supports the following as names and 2) in particular defining "toc" breaks the FUNC_START macro as ".toc" becomes ".2" */ @@ -63,7 +87,7 @@ #define f16 16 #define f17 17 #define f18 18 -#define f19 19 +#define f19 19 #define f20 20 #define f21 21 #define f22 22 @@ -77,6 +101,143 @@ #define f30 30 #define f31 31 +#ifdef __VSX__ +#define f32 32 +#define f33 33 +#define f34 34 +#define f35 35 +#define f36 36 +#define f37 37 +#define f38 38 +#define f39 39 +#define f40 40 +#define f41 41 +#define f42 42 +#define f43 43 +#define f44 44 +#define f45 45 +#define f46 46 +#define f47 47 +#define f48 48 +#define f49 49 +#define f50 50 +#define f51 51 +#define f52 52 +#define f53 53 +#define f54 54 +#define f55 55 +#define f56 56 +#define f57 57 +#define f58 58 +#define f59 59 +#define f60 60 +#define f61 61 +#define f62 62 +#define f63 63 +#endif + +#ifdef __ALTIVEC__ +#define v0 0 +#define v1 1 +#define v2 2 +#define v3 3 +#define v4 4 +#define v5 5 +#define v6 6 +#define v7 7 +#define v8 8 +#define v9 9 +#define v10 10 +#define v11 11 +#define v12 12 +#define v13 13 +#define v14 14 +#define v15 15 +#define v16 16 +#define v17 17 +#define v18 18 +#define v19 19 +#define v20 20 +#define v21 21 +#define v22 22 +#define v23 23 +#define v24 24 +#define v25 25 +#define v26 26 +#define v27 27 +#define v28 28 +#define v29 29 +#define v30 30 +#define v31 31 +#endif + +#ifdef __VSX__ +#define vs0 0 +#define vs1 1 +#define vs2 2 +#define vs3 3 +#define vs4 4 +#define vs5 5 +#define vs6 6 +#define vs7 7 +#define vs8 8 +#define vs9 9 +#define vs10 10 +#define vs11 11 +#define vs12 12 +#define vs13 13 +#define vs14 14 +#define vs15 15 +#define vs16 16 +#define vs17 17 +#define vs18 18 +#define vs19 19 +#define vs20 20 
+#define vs21 21 +#define vs22 22 +#define vs23 23 +#define vs24 24 +#define vs25 25 +#define vs26 26 +#define vs27 27 +#define vs28 28 +#define vs29 29 +#define vs30 30 +#define vs31 31 +#define vs32 32 +#define vs33 33 +#define vs34 34 +#define vs35 35 +#define vs36 36 +#define vs37 37 +#define vs38 38 +#define vs39 39 +#define vs40 40 +#define vs41 41 +#define vs42 42 +#define vs43 43 +#define vs44 44 +#define vs45 45 +#define vs46 46 +#define vs47 47 +#define vs48 48 +#define vs49 49 +#define vs50 50 +#define vs51 51 +#define vs52 52 +#define vs53 53 +#define vs54 54 +#define vs55 55 +#define vs56 56 +#define vs57 57 +#define vs58 58 +#define vs59 59 +#define vs60 60 +#define vs61 61 +#define vs62 62 +#define vs63 63 +#endif + /* * Macros to glue together two tokens. */ @@ -110,6 +271,11 @@ name: \ .globl GLUE(.,name); \ GLUE(.,name): +#define HIDDEN_FUNC(name) \ + FUNC_START(name) \ + .hidden name; \ + .hidden GLUE(.,name); + #define FUNC_END(name) \ GLUE(.L,name): \ .size GLUE(.,name),GLUE(.L,name)-GLUE(.,name) @@ -136,6 +302,11 @@ name: \ .globl GLUE(.,name); \ GLUE(.,name): +#define HIDDEN_FUNC(name) \ + FUNC_START(name) \ + .hidden name; \ + .hidden GLUE(.,name); + #define FUNC_END(name) \ GLUE(.L,name): \ .size GLUE(.,name),GLUE(.L,name)-GLUE(.,name) @@ -153,11 +324,34 @@ GLUE(.L,name): \ .globl FUNC_NAME(name); \ FUNC_NAME(name): +#define HIDDEN_FUNC(name) \ + FUNC_START(name) \ + .hidden FUNC_NAME(name); + #define FUNC_END(name) \ GLUE(.L,name): \ .size FUNC_NAME(name),GLUE(.L,name)-FUNC_NAME(name) #endif +#ifdef IN_GCC +/* For HAVE_GAS_CFI_DIRECTIVE. 
*/ +#include "auto-host.h" + +#ifdef HAVE_GAS_CFI_DIRECTIVE +# define CFI_STARTPROC .cfi_startproc +# define CFI_ENDPROC .cfi_endproc +# define CFI_OFFSET(reg, off) .cfi_offset reg, off +# define CFI_DEF_CFA_REGISTER(reg) .cfi_def_cfa_register reg +# define CFI_RESTORE(reg) .cfi_restore reg +#else +# define CFI_STARTPROC +# define CFI_ENDPROC +# define CFI_OFFSET(reg, off) +# define CFI_DEF_CFA_REGISTER(reg) +# define CFI_RESTORE(reg) +#endif +#endif + #if defined __linux__ && !defined __powerpc64__ .section .note.GNU-stack .previous Index: gcc-4.3.4-20091019/gcc/config/rs6000/predicates.md =================================================================== --- gcc-4.3.4-20091019.orig/gcc/config/rs6000/predicates.md 2009-10-19 13:39:52.000000000 +0200 +++ gcc-4.3.4-20091019/gcc/config/rs6000/predicates.md 2009-10-19 13:40:37.000000000 +0200 @@ -38,6 +38,37 @@ || ALTIVEC_REGNO_P (REGNO (op)) || REGNO (op) > LAST_VIRTUAL_REGISTER"))) +;; Return 1 if op is a VSX register. +(define_predicate "vsx_register_operand" + (and (match_operand 0 "register_operand") + (match_test "GET_CODE (op) != REG + || VSX_REGNO_P (REGNO (op)) + || REGNO (op) > LAST_VIRTUAL_REGISTER"))) + +;; Return 1 if op is a vector register that operates on floating point vectors +;; (either altivec or VSX). +(define_predicate "vfloat_operand" + (and (match_operand 0 "register_operand") + (match_test "GET_CODE (op) != REG + || VFLOAT_REGNO_P (REGNO (op)) + || REGNO (op) > LAST_VIRTUAL_REGISTER"))) + +;; Return 1 if op is a vector register that operates on integer vectors +;; (only altivec, VSX doesn't support integer vectors) +(define_predicate "vint_operand" + (and (match_operand 0 "register_operand") + (match_test "GET_CODE (op) != REG + || VINT_REGNO_P (REGNO (op)) + || REGNO (op) > LAST_VIRTUAL_REGISTER"))) + +;; Return 1 if op is a vector register to do logical operations on (and, or, +;; xor, etc.) 
+(define_predicate "vlogical_operand" + (and (match_operand 0 "register_operand") + (match_test "GET_CODE (op) != REG + || VLOGICAL_REGNO_P (REGNO (op)) + || REGNO (op) > LAST_VIRTUAL_REGISTER"))) + ;; Return 1 if op is XER register. (define_predicate "xer_operand" (and (match_code "reg") @@ -200,7 +231,8 @@ return 0; /* Consider all constants with -msoft-float to be easy. */ - if ((TARGET_SOFT_FLOAT || TARGET_E500_SINGLE) + if ((TARGET_SOFT_FLOAT || TARGET_E500_SINGLE + || (TARGET_HARD_FLOAT && (TARGET_SINGLE_FLOAT && ! TARGET_DOUBLE_FLOAT))) && mode != DImode) return 1; @@ -233,6 +265,10 @@ && num_insns_constant_wide ((HOST_WIDE_INT) k[3]) == 1); case DFmode: + /* The constant 0.f is easy under VSX. */ + if (op == CONST0_RTX (DFmode) && VECTOR_UNIT_VSX_P (DFmode)) + return 1; + /* Force constants to memory before reload to utilize compress_float_constant. Avoid this when flag_unsafe_math_optimizations is enabled @@ -291,6 +327,9 @@ if (TARGET_PAIRED_FLOAT) return false; + if ((VSX_VECTOR_MODE (mode) || mode == TImode) && zero_constant (op, mode)) + return true; + if (ALTIVEC_VECTOR_MODE (mode)) { if (zero_constant (op, mode)) @@ -338,6 +377,16 @@ return EASY_VECTOR_15_ADD_SELF (val); }) +;; Same as easy_vector_constant but only for EASY_VECTOR_MSB. +(define_predicate "easy_vector_constant_msb" + (and (match_code "const_vector") + (and (match_test "TARGET_ALTIVEC") + (match_test "easy_altivec_constant (op, mode)"))) +{ + HOST_WIDE_INT val = const_vector_elt_as_int (op, GET_MODE_NUNITS (mode) - 1); + return EASY_VECTOR_MSB (val, GET_MODE_INNER (mode)); +}) + ;; Return 1 if operand is constant zero (scalars and vectors). (define_predicate "zero_constant" (and (match_code "const_int,const_double,const_vector") @@ -366,25 +415,34 @@ ;; Return 1 if the operand is an offsettable memory operand. 
(define_predicate "offsettable_mem_operand" (and (match_operand 0 "memory_operand") - (match_test "GET_CODE (XEXP (op, 0)) != PRE_INC - && GET_CODE (XEXP (op, 0)) != PRE_DEC - && GET_CODE (XEXP (op, 0)) != PRE_MODIFY"))) + (match_test "offsettable_nonstrict_memref_p (op)"))) ;; Return 1 if the operand is a memory operand with an address divisible by 4 (define_predicate "word_offset_memref_operand" - (and (match_operand 0 "memory_operand") - (match_test "GET_CODE (XEXP (op, 0)) != PLUS - || ! REG_P (XEXP (XEXP (op, 0), 0)) - || GET_CODE (XEXP (XEXP (op, 0), 1)) != CONST_INT - || INTVAL (XEXP (XEXP (op, 0), 1)) % 4 == 0"))) + (match_operand 0 "memory_operand") +{ + /* Address inside MEM. */ + op = XEXP (op, 0); + + /* Extract address from auto-inc/dec. */ + if (GET_CODE (op) == PRE_INC + || GET_CODE (op) == PRE_DEC) + op = XEXP (op, 0); + else if (GET_CODE (op) == PRE_MODIFY) + op = XEXP (op, 1); + + return (GET_CODE (op) != PLUS + || ! REG_P (XEXP (op, 0)) + || GET_CODE (XEXP (op, 1)) != CONST_INT + || INTVAL (XEXP (op, 1)) % 4 == 0); +}) ;; Return 1 if the operand is an indexed or indirect memory operand. 
 (define_predicate "indexed_or_indirect_operand"
   (match_code "mem")
 {
   op = XEXP (op, 0);
-  if (TARGET_ALTIVEC
-      && ALTIVEC_VECTOR_MODE (mode)
+  if (VECTOR_MEM_ALTIVEC_P (mode)
       && GET_CODE (op) == AND
       && GET_CODE (XEXP (op, 1)) == CONST_INT
       && INTVAL (XEXP (op, 1)) == -16)
@@ -393,6 +451,23 @@
   return indexed_or_indirect_address (op, mode);
 })
 
+;; Return 1 if the operand is an indexed or indirect memory operand with an
+;; AND -16 in it, used to recognize when we need to switch to Altivec loads
+;; to realign loops instead of VSX (altivec silently ignores the bottom bits,
+;; while VSX uses the full address and traps)
+(define_predicate "altivec_indexed_or_indirect_operand"
+  (match_code "mem")
+{
+  op = XEXP (op, 0);
+  if (VECTOR_MEM_ALTIVEC_OR_VSX_P (mode)
+      && GET_CODE (op) == AND
+      && GET_CODE (XEXP (op, 1)) == CONST_INT
+      && INTVAL (XEXP (op, 1)) == -16)
+    return indexed_or_indirect_address (XEXP (op, 0), mode);
+
+  return 0;
+})
+
 ;; Return 1 if the operand is an indexed or indirect address.
 (define_special_predicate "indexed_or_indirect_address"
   (and (match_test "REG_P (op)
@@ -923,7 +998,7 @@
   rtx elt;
   int count = XVECLEN (op, 0);
 
-  if (count != 55)
+  if (count != 54)
     return 0;
 
   index = 0;
@@ -972,9 +1047,8 @@
       || GET_MODE (SET_SRC (elt)) != Pmode)
     return 0;
 
-  if (GET_CODE (XVECEXP (op, 0, index++)) != USE
-      || GET_CODE (XVECEXP (op, 0, index++)) != USE
-      || GET_CODE (XVECEXP (op, 0, index++)) != CLOBBER)
+  if (GET_CODE (XVECEXP (op, 0, index++)) != SET
+      || GET_CODE (XVECEXP (op, 0, index++)) != SET)
     return 0;
   return 1;
 })
Index: gcc-4.3.4-20091019/gcc/config/rs6000/rs6000-builtin.def
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ gcc-4.3.4-20091019/gcc/config/rs6000/rs6000-builtin.def	2009-10-19 13:40:37.000000000 +0200
@@ -0,0 +1,990 @@
+/* Builtin functions for rs6000/powerpc.
+   Copyright (C) 2009
+   Free Software Foundation, Inc.
+   Contributed by Michael Meissner (meissner@linux.vnet.ibm.com)
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published
+   by the Free Software Foundation; either version 3, or (at your
+   option) any later version.
+
+   GCC is distributed in the hope that it will be useful, but WITHOUT
+   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+   or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public
+   License for more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
+   <http://www.gnu.org/licenses/>. */
+
+/* Before including this file, two macros must be defined:
+   RS6000_BUILTIN -- 2 arguments, the enum name, and classification
+   RS6000_BUILTIN_EQUATE -- 2 arguments, enum name and value */
+
+/* AltiVec builtins. */
+RS6000_BUILTIN(ALTIVEC_BUILTIN_ST_INTERNAL_4si, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_LD_INTERNAL_4si, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_ST_INTERNAL_8hi, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_LD_INTERNAL_8hi, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_ST_INTERNAL_16qi, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_LD_INTERNAL_16qi, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_ST_INTERNAL_4sf, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_LD_INTERNAL_4sf, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VADDUBM, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VADDUHM, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VADDUWM, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VADDFP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VADDCUW, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VADDUBS, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VADDSBS, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VADDUHS, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VADDSHS, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VADDUWS, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VADDSWS, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VAND, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VANDC, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VAVGUB, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VAVGSB, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VAVGUH, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VAVGSH, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VAVGUW, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VAVGSW, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VCFUX, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VCFSX, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VCTSXS, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VCTUXS, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VCMPBFP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VCMPEQUB, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VCMPEQUH, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VCMPEQUW, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VCMPEQFP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VCMPGEFP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VCMPGTUB, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VCMPGTSB, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VCMPGTUH, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VCMPGTSH, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VCMPGTUW, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VCMPGTSW, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VCMPGTFP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEXPTEFP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VLOGEFP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VMADDFP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VMAXUB, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VMAXSB, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VMAXUH, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VMAXSH, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VMAXUW, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VMAXSW, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VMAXFP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VMHADDSHS, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VMHRADDSHS, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VMLADDUHM, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VMRGHB, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VMRGHH, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VMRGHW, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VMRGLB, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VMRGLH, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VMRGLW, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VMSUMUBM, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VMSUMMBM, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VMSUMUHM, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VMSUMSHM, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VMSUMUHS, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VMSUMSHS, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VMINUB, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VMINSB, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VMINUH, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VMINSH, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VMINUW, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VMINSW, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VMINFP, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VMULEUB, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VMULEUB_UNS, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VMULESB, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VMULEUH, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VMULEUH_UNS, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VMULESH, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VMULOUB, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VMULOUB_UNS, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VMULOSB, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VMULOUH, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VMULOUH_UNS, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VMULOSH, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VNMSUBFP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VNOR, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VOR, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VSEL_2DF, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VSEL_2DI, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VSEL_4SI, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VSEL_4SF, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VSEL_8HI, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VSEL_16QI, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VSEL_2DI_UNS, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VSEL_4SI_UNS, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VSEL_8HI_UNS, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VSEL_16QI_UNS, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VPERM_2DF, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VPERM_2DI, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VPERM_4SI, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VPERM_4SF, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VPERM_8HI, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VPERM_16QI, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VPERM_2DI_UNS, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VPERM_4SI_UNS, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VPERM_8HI_UNS, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VPERM_16QI_UNS, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VPKUHUM, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VPKUWUM, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VPKPX, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VPKUHSS, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VPKSHSS, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VPKUWSS, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VPKSWSS, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VPKUHUS, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VPKSHUS, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VPKUWUS, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VPKSWUS, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VREFP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VRFIM, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VRFIN, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VRFIP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VRFIZ, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VRLB, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VRLH, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VRLW, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VRSQRTEFP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VSLB, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VSLH, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VSLW, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VSL, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VSLO, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VSPLTB, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VSPLTH, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VSPLTW, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VSPLTISB, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VSPLTISH, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VSPLTISW, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VSRB, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VSRH, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VSRW, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VSRAB, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VSRAH, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VSRAW, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VSR, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VSRO, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VSUBUBM, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VSUBUHM, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VSUBUWM, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VSUBFP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VSUBCUW, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VSUBUBS, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VSUBSBS, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VSUBUHS, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VSUBSHS, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VSUBUWS, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VSUBSWS, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VSUM4UBS, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VSUM4SBS, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VSUM4SHS, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VSUM2SWS, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VSUMSWS, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VXOR, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VSLDOI_16QI, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VSLDOI_8HI, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VSLDOI_4SI, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VSLDOI_4SF, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VUPKHSB, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VUPKHPX, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VUPKHSH, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VUPKLSB, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VUPKLPX, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VUPKLSH, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_MTVSCR, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_MFVSCR, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_DSSALL, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_DSS, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_LVSL, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_LVSR, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_DSTT, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_DSTST, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_DSTSTT, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_DST, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_LVEBX, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_LVEHX, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_LVEWX, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_LVXL, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_LVX, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_STVX, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_LVLX, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_LVLXL, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_LVRX, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_LVRXL, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_STVEBX, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_STVEHX, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_STVEWX, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_STVXL, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_STVLX, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_STVLXL, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_STVRX, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_STVRXL, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VCMPBFP_P, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VCMPEQFP_P, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VCMPEQUB_P, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VCMPEQUH_P, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VCMPEQUW_P, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VCMPGEFP_P, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VCMPGTFP_P, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VCMPGTSB_P, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VCMPGTSH_P, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VCMPGTSW_P, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VCMPGTUB_P, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VCMPGTUH_P, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VCMPGTUW_P, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_ABSS_V4SI, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_ABSS_V8HI, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_ABSS_V16QI, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_ABS_V4SI, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_ABS_V4SF, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_ABS_V8HI, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_ABS_V16QI, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_MASK_FOR_LOAD, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_MASK_FOR_STORE, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_INIT_V4SI, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_INIT_V8HI, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_INIT_V16QI, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_INIT_V4SF, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_SET_V4SI, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_SET_V8HI, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_SET_V16QI, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_SET_V4SF, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_EXT_V4SI, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_EXT_V8HI, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_EXT_V16QI, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_EXT_V4SF, RS6000_BTC_CONST)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_COPYSIGN_V4SF, RS6000_BTC_CONST)
+
+/* Altivec overloaded builtins. */
+/* For now, don't set the classification for overloaded functions.
+   The function should be converted to the type specific instruction
+   before we get to the point about classifying the builtin type. */
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VCMPEQ_P, RS6000_BTC_MISC)
+RS6000_BUILTIN_EQUATE(ALTIVEC_BUILTIN_OVERLOADED_FIRST,
+                      ALTIVEC_BUILTIN_VCMPEQ_P)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VCMPGT_P, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VCMPGE_P, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_ABS, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_ABSS, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_ADD, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_ADDC, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_ADDS, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_AND, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_ANDC, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_AVG, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_EXTRACT, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_CEIL, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_CMPB, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_CMPEQ, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_CMPEQUB, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_CMPEQUH, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_CMPEQUW, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_CMPGE, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_CMPGT, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_CMPLE, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_CMPLT, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_COPYSIGN, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_CTF, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_CTS, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_CTU, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_DST, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_DSTST, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_DSTSTT, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_DSTT, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_EXPTE, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_FLOOR, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_LD, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_LDE, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_LDL, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_LOGE, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_LVEBX, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_LVEHX, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_LVEWX, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_LVLX, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_LVLXL, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_LVRX, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_LVRXL, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_LVSL, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_LVSR, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_MADD, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_MADDS, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_MAX, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_MERGEH, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_MERGEL, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_MIN, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_MLADD, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_MPERM, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_MRADDS, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_MRGHB, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_MRGHH, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_MRGHW, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_MRGLB, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_MRGLH, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_MRGLW, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_MSUM, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_MSUMS, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_MTVSCR, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_MULE, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_MULO, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_NEARBYINT, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_NMSUB, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_NOR, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_OR, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_PACK, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_PACKPX, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_PACKS, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_PACKSU, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_PERM, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_RE, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_RL, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_RINT, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_ROUND, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_RSQRTE, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_SEL, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_SL, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_SLD, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_SLL, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_SLO, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_SPLAT, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_SPLAT_S16, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_SPLAT_S32, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_SPLAT_S8, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_SPLAT_U16, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_SPLAT_U32, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_SPLAT_U8, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_SPLTB, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_SPLTH, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_SPLTW, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_SQRT, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_SR, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_SRA, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_SRL, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_SRO, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_ST, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_STE, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_STL, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_STVEBX, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_STVEHX, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_STVEWX, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_STVLX, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_STVLXL, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_STVRX, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_STVRXL, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_SUB, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_SUBC, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_SUBS, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_SUM2S, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_SUM4S, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_SUMS, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_TRUNC, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_UNPACKH, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_UNPACKL, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VADDFP, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VADDSBS, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VADDSHS, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VADDSWS, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VADDUBM, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VADDUBS, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VADDUHM, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VADDUHS, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VADDUWM, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VADDUWS, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VAVGSB, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VAVGSH, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VAVGSW, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VAVGUB, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VAVGUH, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VAVGUW, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VCFSX, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VCFUX, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VCMPEQFP, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VCMPEQUB, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VCMPEQUH, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VCMPEQUW, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VCMPGTFP, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VCMPGTSB, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VCMPGTSH, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VCMPGTSW, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VCMPGTUB, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VCMPGTUH, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VCMPGTUW, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VMAXFP, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VMAXSB, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VMAXSH, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VMAXSW, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VMAXUB, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VMAXUH, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VMAXUW, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VMINFP, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VMINSB, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VMINSH, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VMINSW, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VMINUB, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VMINUH, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VMINUW, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VMRGHB, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VMRGHH, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VMRGHW, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VMRGLB, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VMRGLH, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VMRGLW, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VMSUMMBM, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VMSUMSHM, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VMSUMSHS, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VMSUMUBM, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VMSUMUHM, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VMSUMUHS, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VMULESB, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VMULESH, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VMULEUB, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VMULEUH, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VMULOSB, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VMULOSH, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VMULOUB, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VMULOUH, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VPKSHSS, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VPKSHUS, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VPKSWSS, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VPKSWUS, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VPKUHUM, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VPKUHUS, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VPKUWUM, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VPKUWUS, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VRLB, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VRLH, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VRLW, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VSLB, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VSLH, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VSLW, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VSPLTB, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VSPLTH, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VSPLTW, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VSRAB, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VSRAH, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VSRAW, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VSRB, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VSRH, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VSRW, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VSUBFP, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VSUBSBS, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VSUBSHS, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VSUBSWS, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VSUBUBM, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VSUBUBS, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VSUBUHM, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VSUBUHS, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VSUBUWM, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VSUBUWS, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VSUM4SBS, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VSUM4SHS, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VSUM4UBS, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VUPKHPX, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VUPKHSB, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VUPKHSH, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VUPKLPX, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VUPKLSB, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_VUPKLSH, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_XOR, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_STEP, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_PROMOTE, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_INSERT, RS6000_BTC_MISC)
+RS6000_BUILTIN(ALTIVEC_BUILTIN_VEC_SPLATS, RS6000_BTC_MISC)
+RS6000_BUILTIN_EQUATE(ALTIVEC_BUILTIN_OVERLOADED_LAST,
+                      ALTIVEC_BUILTIN_VEC_SPLATS)
+
+/* SPE builtins. */
+RS6000_BUILTIN(SPE_BUILTIN_EVADDW, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVAND, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVANDC, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVDIVWS, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVDIVWU, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVEQV, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVFSADD, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVFSDIV, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVFSMUL, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVFSSUB, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVLDDX, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVLDHX, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVLDWX, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVLHHESPLATX, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVLHHOSSPLATX, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVLHHOUSPLATX, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVLWHEX, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVLWHOSX, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVLWHOUX, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVLWHSPLATX, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVLWWSPLATX, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMERGEHI, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMERGEHILO, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMERGELO, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMERGELOHI, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMHEGSMFAA, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMHEGSMFAN, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMHEGSMIAA, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMHEGSMIAN, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMHEGUMIAA, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMHEGUMIAN, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMHESMF, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMHESMFA, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMHESMFAAW, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMHESMFANW, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMHESMI, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMHESMIA, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMHESMIAAW, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMHESMIANW, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMHESSF, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMHESSFA, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMHESSFAAW, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMHESSFANW, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMHESSIAAW, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMHESSIANW, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMHEUMI, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMHEUMIA, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMHEUMIAAW, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMHEUMIANW, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMHEUSIAAW, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMHEUSIANW, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMHOGSMFAA, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMHOGSMFAN, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMHOGSMIAA, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMHOGSMIAN, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMHOGUMIAA, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMHOGUMIAN, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMHOSMF, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMHOSMFA, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMHOSMFAAW, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMHOSMFANW, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMHOSMI, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMHOSMIA, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMHOSMIAAW, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMHOSMIANW, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMHOSSF, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMHOSSFA, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMHOSSFAAW, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMHOSSFANW, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMHOSSIAAW, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMHOSSIANW, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMHOUMI, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMHOUMIA, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMHOUMIAAW, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMHOUMIANW, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMHOUSIAAW, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMHOUSIANW, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMWHSMF, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMWHSMFA, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMWHSMI, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMWHSMIA, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMWHSSF, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMWHSSFA, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMWHUMI, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMWHUMIA, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMWLSMIAAW, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMWLSMIANW, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMWLSSIAAW, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMWLSSIANW, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMWLUMI, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMWLUMIA, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMWLUMIAAW, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMWLUMIANW, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMWLUSIAAW, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMWLUSIANW, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMWSMF, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMWSMFA, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMWSMFAA, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMWSMFAN, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMWSMI, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMWSMIA, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMWSMIAA, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMWSMIAN, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMWHSSFAA, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMWSSF, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMWSSFA, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMWSSFAA, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMWSSFAN, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMWUMI, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMWUMIA, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMWUMIAA, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMWUMIAN, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVNAND, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVNOR, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVOR, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVORC, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVRLW, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVSLW, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVSRWS, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVSRWU, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVSTDDX, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVSTDHX, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVSTDWX, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVSTWHEX, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVSTWHOX, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVSTWWEX, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVSTWWOX, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVSUBFW, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVXOR, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVABS, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVADDSMIAAW, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVADDSSIAAW, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVADDUMIAAW, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVADDUSIAAW, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVCNTLSW, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVCNTLZW, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVEXTSB, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVEXTSH, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVFSABS, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVFSCFSF, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVFSCFSI, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVFSCFUF, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVFSCFUI, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVFSCTSF, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVFSCTSI, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVFSCTSIZ, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVFSCTUF, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVFSCTUI, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVFSCTUIZ, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVFSNABS, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVFSNEG, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMRA, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVNEG, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVRNDW, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVSUBFSMIAAW, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVSUBFSSIAAW, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVSUBFUMIAAW, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVSUBFUSIAAW, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVADDIW, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVLDD, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVLDH, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVLDW, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVLHHESPLAT, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVLHHOSSPLAT, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVLHHOUSPLAT, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVLWHE, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVLWHOS, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVLWHOU, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVLWHSPLAT, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVLWWSPLAT, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVRLWI, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVSLWI, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVSRWIS, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVSRWIU, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVSTDD, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVSTDH, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVSTDW, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVSTWHE, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVSTWHO, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVSTWWE, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVSTWWO, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVSUBIFW, RS6000_BTC_MISC)
+
+  /* Compares.  */
+RS6000_BUILTIN(SPE_BUILTIN_EVCMPEQ, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVCMPGTS, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVCMPGTU, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVCMPLTS, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVCMPLTU, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVFSCMPEQ, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVFSCMPGT, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVFSCMPLT, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVFSTSTEQ, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVFSTSTGT, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVFSTSTLT, RS6000_BTC_MISC)
+
+/* EVSEL compares.  */
+RS6000_BUILTIN(SPE_BUILTIN_EVSEL_CMPEQ, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVSEL_CMPGTS, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVSEL_CMPGTU, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVSEL_CMPLTS, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVSEL_CMPLTU, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVSEL_FSCMPEQ, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVSEL_FSCMPGT, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVSEL_FSCMPLT, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVSEL_FSTSTEQ, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVSEL_FSTSTGT, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVSEL_FSTSTLT, RS6000_BTC_MISC)
+
+RS6000_BUILTIN(SPE_BUILTIN_EVSPLATFI, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVSPLATI, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMWHSSMAA, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMWHSMFAA, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMWHSMIAA, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMWHUSIAA, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMWHUMIAA, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMWHSSFAN, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMWHSSIAN, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMWHSMFAN, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMWHSMIAN, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMWHUSIAN, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMWHUMIAN, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMWHGSSFAA, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMWHGSMFAA, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMWHGSMIAA, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMWHGUMIAA, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMWHGSSFAN, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMWHGSMFAN, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMWHGSMIAN, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_EVMWHGUMIAN, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_MTSPEFSCR, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_MFSPEFSCR, RS6000_BTC_MISC)
+RS6000_BUILTIN(SPE_BUILTIN_BRINC, RS6000_BTC_MISC)
+
+/* PAIRED builtins.  */
+RS6000_BUILTIN(PAIRED_BUILTIN_DIVV2SF3, RS6000_BTC_MISC)
+RS6000_BUILTIN(PAIRED_BUILTIN_ABSV2SF2, RS6000_BTC_MISC)
+RS6000_BUILTIN(PAIRED_BUILTIN_NEGV2SF2, RS6000_BTC_MISC)
+RS6000_BUILTIN(PAIRED_BUILTIN_SQRTV2SF2, RS6000_BTC_MISC)
+RS6000_BUILTIN(PAIRED_BUILTIN_ADDV2SF3, RS6000_BTC_MISC)
+RS6000_BUILTIN(PAIRED_BUILTIN_SUBV2SF3, RS6000_BTC_MISC)
+RS6000_BUILTIN(PAIRED_BUILTIN_RESV2SF2, RS6000_BTC_MISC)
+RS6000_BUILTIN(PAIRED_BUILTIN_MULV2SF3, RS6000_BTC_MISC)
+RS6000_BUILTIN(PAIRED_BUILTIN_MSUB, RS6000_BTC_MISC)
+RS6000_BUILTIN(PAIRED_BUILTIN_MADD, RS6000_BTC_MISC)
+RS6000_BUILTIN(PAIRED_BUILTIN_NMSUB, RS6000_BTC_MISC)
+RS6000_BUILTIN(PAIRED_BUILTIN_NMADD, RS6000_BTC_MISC)
+RS6000_BUILTIN(PAIRED_BUILTIN_NABSV2SF2, RS6000_BTC_MISC)
+RS6000_BUILTIN(PAIRED_BUILTIN_SUM0, RS6000_BTC_MISC)
+RS6000_BUILTIN(PAIRED_BUILTIN_SUM1, RS6000_BTC_MISC)
+RS6000_BUILTIN(PAIRED_BUILTIN_MULS0, RS6000_BTC_MISC)
+RS6000_BUILTIN(PAIRED_BUILTIN_MULS1, RS6000_BTC_MISC)
+RS6000_BUILTIN(PAIRED_BUILTIN_MERGE00, RS6000_BTC_MISC)
+RS6000_BUILTIN(PAIRED_BUILTIN_MERGE01, RS6000_BTC_MISC)
+RS6000_BUILTIN(PAIRED_BUILTIN_MERGE10, RS6000_BTC_MISC)
+RS6000_BUILTIN(PAIRED_BUILTIN_MERGE11, RS6000_BTC_MISC)
+RS6000_BUILTIN(PAIRED_BUILTIN_MADDS0, RS6000_BTC_MISC)
+RS6000_BUILTIN(PAIRED_BUILTIN_MADDS1, RS6000_BTC_MISC)
+RS6000_BUILTIN(PAIRED_BUILTIN_STX, RS6000_BTC_MISC)
+RS6000_BUILTIN(PAIRED_BUILTIN_LX, RS6000_BTC_MISC)
+RS6000_BUILTIN(PAIRED_BUILTIN_SELV2SF4, RS6000_BTC_MISC)
+RS6000_BUILTIN(PAIRED_BUILTIN_CMPU0, RS6000_BTC_MISC)
+RS6000_BUILTIN(PAIRED_BUILTIN_CMPU1, RS6000_BTC_MISC)
+
+  /* VSX builtins.  */
+RS6000_BUILTIN(VSX_BUILTIN_LXSDX, RS6000_BTC_MISC)
+RS6000_BUILTIN(VSX_BUILTIN_LXVD2X, RS6000_BTC_MISC)
+RS6000_BUILTIN(VSX_BUILTIN_LXVDSX, RS6000_BTC_MISC)
+RS6000_BUILTIN(VSX_BUILTIN_LXVW4X, RS6000_BTC_MISC)
+RS6000_BUILTIN(VSX_BUILTIN_STXSDX, RS6000_BTC_MISC)
+RS6000_BUILTIN(VSX_BUILTIN_STXVD2X, RS6000_BTC_MISC)
+RS6000_BUILTIN(VSX_BUILTIN_STXVW4X, RS6000_BTC_MISC)
+RS6000_BUILTIN(VSX_BUILTIN_XSABSDP, RS6000_BTC_CONST)
+RS6000_BUILTIN(VSX_BUILTIN_XSADDDP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XSCMPODP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XSCMPUDP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XSCPSGNDP, RS6000_BTC_CONST)
+RS6000_BUILTIN(VSX_BUILTIN_XSCVDPSP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XSCVDPSXDS, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XSCVDPSXWS, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XSCVDPUXDS, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XSCVDPUXWS, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XSCVSPDP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XSCVSXDDP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XSCVUXDDP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XSDIVDP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XSMADDADP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XSMADDMDP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XSMAXDP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XSMINDP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XSMOVDP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XSMSUBADP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XSMSUBMDP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XSMULDP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XSNABSDP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XSNEGDP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XSNMADDADP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XSNMADDMDP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XSNMSUBADP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XSNMSUBMDP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XSRDPI, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XSRDPIC, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XSRDPIM, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XSRDPIP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XSRDPIZ, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XSREDP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XSRSQRTEDP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XSSQRTDP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XSSUBDP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_CPSGNDP, RS6000_BTC_CONST)
+RS6000_BUILTIN(VSX_BUILTIN_CPSGNSP, RS6000_BTC_CONST)
+RS6000_BUILTIN(VSX_BUILTIN_XSTDIVDP_FE, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XSTDIVDP_FG, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XSTSQRTDP_FE, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XSTSQRTDP_FG, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVABSDP, RS6000_BTC_CONST)
+RS6000_BUILTIN(VSX_BUILTIN_XVABSSP, RS6000_BTC_CONST)
+RS6000_BUILTIN(VSX_BUILTIN_XVADDDP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVADDSP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVCMPEQDP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVCMPEQSP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVCMPGEDP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVCMPGESP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVCMPGTDP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVCMPGTSP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVCMPEQDP_P, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVCMPEQSP_P, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVCMPGEDP_P, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVCMPGESP_P, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVCMPGTDP_P, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVCMPGTSP_P, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVCPSGNDP, RS6000_BTC_CONST)
+RS6000_BUILTIN(VSX_BUILTIN_XVCPSGNSP, RS6000_BTC_CONST)
+RS6000_BUILTIN(VSX_BUILTIN_XVCVDPSP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVCVDPSXDS, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVCVDPSXWS, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVCVDPUXDS, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVCVDPUXDS_UNS, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVCVDPUXWS, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVCVSPDP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVCVSPSXDS, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVCVSPSXWS, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVCVSPUXDS, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVCVSPUXWS, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVCVSXDDP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVCVSXDSP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVCVSXWDP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVCVSXWSP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVCVUXDDP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVCVUXDDP_UNS, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVCVUXDSP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVCVUXWDP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVCVUXWSP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVDIVDP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVDIVSP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVMADDDP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVMADDSP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVMAXDP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVMAXSP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVMINDP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVMINSP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVMSUBDP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVMSUBSP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVMULDP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVMULSP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVNABSDP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVNABSSP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVNEGDP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVNEGSP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVNMADDDP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVNMADDSP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVNMSUBDP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVNMSUBSP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVRDPI, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVRDPIC, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVRDPIM, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVRDPIP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVRDPIZ, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVREDP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVRESP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVRSPI, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVRSPIC, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVRSPIM, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVRSPIP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVRSPIZ, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVRSQRTEDP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVRSQRTESP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVSQRTDP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVSQRTSP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVSUBDP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVSUBSP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVTDIVDP_FE, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVTDIVDP_FG, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVTDIVSP_FE, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVTDIVSP_FG, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVTSQRTDP_FE, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVTSQRTDP_FG, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVTSQRTSP_FE, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XVTSQRTSP_FG, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XXSEL_2DI, RS6000_BTC_CONST)
+RS6000_BUILTIN(VSX_BUILTIN_XXSEL_2DF, RS6000_BTC_CONST)
+RS6000_BUILTIN(VSX_BUILTIN_XXSEL_4SI, RS6000_BTC_CONST)
+RS6000_BUILTIN(VSX_BUILTIN_XXSEL_4SF, RS6000_BTC_CONST)
+RS6000_BUILTIN(VSX_BUILTIN_XXSEL_8HI, RS6000_BTC_CONST)
+RS6000_BUILTIN(VSX_BUILTIN_XXSEL_16QI, RS6000_BTC_CONST)
+RS6000_BUILTIN(VSX_BUILTIN_XXSEL_2DI_UNS, RS6000_BTC_CONST)
+RS6000_BUILTIN(VSX_BUILTIN_XXSEL_4SI_UNS, RS6000_BTC_CONST)
+RS6000_BUILTIN(VSX_BUILTIN_XXSEL_8HI_UNS, RS6000_BTC_CONST)
+RS6000_BUILTIN(VSX_BUILTIN_XXSEL_16QI_UNS, RS6000_BTC_CONST)
+RS6000_BUILTIN(VSX_BUILTIN_VPERM_2DI, RS6000_BTC_CONST)
+RS6000_BUILTIN(VSX_BUILTIN_VPERM_2DF, RS6000_BTC_CONST)
+RS6000_BUILTIN(VSX_BUILTIN_VPERM_4SI, RS6000_BTC_CONST)
+RS6000_BUILTIN(VSX_BUILTIN_VPERM_4SF, RS6000_BTC_CONST)
+RS6000_BUILTIN(VSX_BUILTIN_VPERM_8HI, RS6000_BTC_CONST)
+RS6000_BUILTIN(VSX_BUILTIN_VPERM_16QI, RS6000_BTC_CONST)
+RS6000_BUILTIN(VSX_BUILTIN_VPERM_2DI_UNS, RS6000_BTC_CONST)
+RS6000_BUILTIN(VSX_BUILTIN_VPERM_4SI_UNS, RS6000_BTC_CONST)
+RS6000_BUILTIN(VSX_BUILTIN_VPERM_8HI_UNS, RS6000_BTC_CONST)
+RS6000_BUILTIN(VSX_BUILTIN_VPERM_16QI_UNS, RS6000_BTC_CONST)
+RS6000_BUILTIN(VSX_BUILTIN_XXPERMDI_2DF, RS6000_BTC_CONST)
+RS6000_BUILTIN(VSX_BUILTIN_XXPERMDI_2DI, RS6000_BTC_CONST)
+RS6000_BUILTIN(VSX_BUILTIN_XXPERMDI_4SF, RS6000_BTC_CONST)
+RS6000_BUILTIN(VSX_BUILTIN_XXPERMDI_4SI, RS6000_BTC_CONST)
+RS6000_BUILTIN(VSX_BUILTIN_XXPERMDI_8HI, RS6000_BTC_CONST)
+RS6000_BUILTIN(VSX_BUILTIN_XXPERMDI_16QI, RS6000_BTC_CONST)
+RS6000_BUILTIN(VSX_BUILTIN_CONCAT_2DF, RS6000_BTC_CONST)
+RS6000_BUILTIN(VSX_BUILTIN_CONCAT_2DI, RS6000_BTC_CONST)
+RS6000_BUILTIN(VSX_BUILTIN_SET_2DF, RS6000_BTC_CONST)
+RS6000_BUILTIN(VSX_BUILTIN_SET_2DI, RS6000_BTC_CONST)
+RS6000_BUILTIN(VSX_BUILTIN_SPLAT_2DF, RS6000_BTC_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_SPLAT_2DI, RS6000_BTC_PURE)
+RS6000_BUILTIN(VSX_BUILTIN_XXMRGHW_4SF, RS6000_BTC_CONST)
+RS6000_BUILTIN(VSX_BUILTIN_XXMRGHW_4SI, RS6000_BTC_CONST)
+RS6000_BUILTIN(VSX_BUILTIN_XXMRGLW_4SF, RS6000_BTC_CONST)
+RS6000_BUILTIN(VSX_BUILTIN_XXMRGLW_4SI, RS6000_BTC_CONST)
+RS6000_BUILTIN(VSX_BUILTIN_XXSLDWI_16QI, RS6000_BTC_CONST)
+RS6000_BUILTIN(VSX_BUILTIN_XXSLDWI_8HI, RS6000_BTC_CONST)
+RS6000_BUILTIN(VSX_BUILTIN_XXSLDWI_4SI, RS6000_BTC_CONST)
+RS6000_BUILTIN(VSX_BUILTIN_XXSLDWI_4SF, RS6000_BTC_CONST)
+RS6000_BUILTIN(VSX_BUILTIN_XXSLDWI_2DI, RS6000_BTC_CONST)
+RS6000_BUILTIN(VSX_BUILTIN_XXSLDWI_2DF, RS6000_BTC_CONST)
+RS6000_BUILTIN(VSX_BUILTIN_VEC_INIT_V2DF, RS6000_BTC_CONST)
+RS6000_BUILTIN(VSX_BUILTIN_VEC_INIT_V2DI, RS6000_BTC_CONST)
+RS6000_BUILTIN(VSX_BUILTIN_VEC_SET_V2DF, RS6000_BTC_CONST)
+RS6000_BUILTIN(VSX_BUILTIN_VEC_SET_V2DI, RS6000_BTC_CONST)
+RS6000_BUILTIN(VSX_BUILTIN_VEC_EXT_V2DF, RS6000_BTC_CONST)
+RS6000_BUILTIN(VSX_BUILTIN_VEC_EXT_V2DI, RS6000_BTC_CONST)
+
+/* VSX overloaded builtins, add the overloaded functions not present in
+   Altivec.  */
+RS6000_BUILTIN(VSX_BUILTIN_VEC_MUL, RS6000_BTC_MISC)
+RS6000_BUILTIN_EQUATE(VSX_BUILTIN_OVERLOADED_FIRST,
+                      VSX_BUILTIN_VEC_MUL)
+RS6000_BUILTIN(VSX_BUILTIN_VEC_MSUB, RS6000_BTC_MISC)
+RS6000_BUILTIN(VSX_BUILTIN_VEC_NMADD, RS6000_BTC_MISC)
+RS6000_BUILTIN(VSX_BUITLIN_VEC_NMSUB, RS6000_BTC_MISC)
+RS6000_BUILTIN(VSX_BUILTIN_VEC_DIV, RS6000_BTC_MISC)
+RS6000_BUILTIN(VSX_BUILTIN_VEC_XXMRGHW, RS6000_BTC_MISC)
+RS6000_BUILTIN(VSX_BUILTIN_VEC_XXMRGLW, RS6000_BTC_MISC)
+RS6000_BUILTIN(VSX_BUILTIN_VEC_XXPERMDI, RS6000_BTC_MISC)
+RS6000_BUILTIN(VSX_BUILTIN_VEC_XXSLDWI, RS6000_BTC_MISC)
+RS6000_BUILTIN(VSX_BUILTIN_VEC_XXSPLTD, RS6000_BTC_MISC)
+RS6000_BUILTIN(VSX_BUILTIN_VEC_XXSPLTW, RS6000_BTC_MISC)
+RS6000_BUILTIN_EQUATE(VSX_BUILTIN_OVERLOADED_LAST,
+                      VSX_BUILTIN_VEC_XXSPLTW)
+
+/* Combined VSX/Altivec builtins.  */
+RS6000_BUILTIN(VECTOR_BUILTIN_FLOAT_V4SI_V4SF, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VECTOR_BUILTIN_UNSFLOAT_V4SI_V4SF, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VECTOR_BUILTIN_FIX_V4SF_V4SI, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(VECTOR_BUILTIN_FIXUNS_V4SF_V4SI, RS6000_BTC_FP_PURE)
+
+/* Power7 builtins, that aren't VSX instructions.  */
+RS6000_BUILTIN(POWER7_BUILTIN_BPERMD, RS6000_BTC_CONST)
+
+/* Miscellaneous builtins.  */
+RS6000_BUILTIN(RS6000_BUILTIN_RECIP, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(RS6000_BUILTIN_RECIPF, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(RS6000_BUILTIN_RSQRTF, RS6000_BTC_FP_PURE)
+RS6000_BUILTIN(RS6000_BUILTIN_BSWAP_HI, RS6000_BTC_CONST)
Index: gcc-4.3.4-20091019/gcc/config/rs6000/rs6000-c.c
===================================================================
--- gcc-4.3.4-20091019.orig/gcc/config/rs6000/rs6000-c.c 2009-10-19 13:39:52.000000000 +0200
+++ gcc-4.3.4-20091019/gcc/config/rs6000/rs6000-c.c 2009-10-19 13:40:37.000000000 +0200
@@ -1,5 +1,5 @@
 /* Subroutines for the C front end on the POWER and PowerPC architectures.
-   Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008
+   Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009
    Free Software Foundation, Inc.
 
    Contributed by Zack Weinberg <zack@codesourcery.com>
@@ -85,12 +85,13 @@ rs6000_pragma_longcall (cpp_reader *pfil
 #define builtin_assert(TXT) cpp_assert (pfile, TXT)
 
 /* Keep the AltiVec keywords handy for fast comparisons.  */
-static GTY(()) tree __vector_keyword;
-static GTY(()) tree vector_keyword;
-static GTY(()) tree __pixel_keyword;
-static GTY(()) tree pixel_keyword;
-static GTY(()) tree __bool_keyword;
-static GTY(()) tree bool_keyword;
+static tree GTY(()) __vector_keyword;
+static tree GTY(()) vector_keyword;
+static tree GTY(()) __pixel_keyword;
+static tree GTY(()) pixel_keyword;
+static tree GTY(()) __bool_keyword;
+static tree GTY(()) bool_keyword;
+static tree GTY(()) _Bool_keyword;
 
 /* Preserved across calls.  */
 static tree expand_bool_pixel;
@@ -102,16 +103,16 @@ altivec_categorize_keyword (const cpp_to
 {
   cpp_hashnode *ident = tok->val.node;
 
-  if (ident == C_CPP_HASHNODE (vector_keyword)
-      || ident == C_CPP_HASHNODE (__vector_keyword))
+  if (ident == C_CPP_HASHNODE (vector_keyword))
     return C_CPP_HASHNODE (__vector_keyword);
 
-  if (ident == C_CPP_HASHNODE (pixel_keyword)
-      || ident == C_CPP_HASHNODE (__pixel_keyword))
+  if (ident == C_CPP_HASHNODE (pixel_keyword))
     return C_CPP_HASHNODE (__pixel_keyword);
 
-  if (ident == C_CPP_HASHNODE (bool_keyword)
-      || ident == C_CPP_HASHNODE (__bool_keyword))
+  if (ident == C_CPP_HASHNODE (bool_keyword))
+    return C_CPP_HASHNODE (__bool_keyword);
+
+  if (ident == C_CPP_HASHNODE (_Bool_keyword))
     return C_CPP_HASHNODE (__bool_keyword);
 
   return ident;
@@ -144,6 +145,9 @@ init_vector_keywords (void)
 
   bool_keyword = get_identifier ("bool");
   C_CPP_HASHNODE (bool_keyword)->flags |= NODE_CONDITIONAL;
+
+  _Bool_keyword = get_identifier ("_Bool");
+  C_CPP_HASHNODE (_Bool_keyword)->flags |= NODE_CONDITIONAL;
 }
 
 /* Called to decide whether a conditional macro should be expanded.
@@ -158,12 +162,18 @@ rs6000_macro_to_expand (cpp_reader *pfil
 
   ident = altivec_categorize_keyword (tok);
 
+  if (ident != expand_this)
+    expand_this = NULL;
+
   if (ident == C_CPP_HASHNODE (__vector_keyword))
     {
-      tok = cpp_peek_token (pfile, 0);
+      int idx = 0;
+      do
+        tok = cpp_peek_token (pfile, idx++);
+      while (tok->type == CPP_PADDING);
       ident = altivec_categorize_keyword (tok);
 
-      if (ident == C_CPP_HASHNODE (__pixel_keyword))
+      if (ident == C_CPP_HASHNODE (__pixel_keyword))
         {
           expand_this = C_CPP_HASHNODE (__vector_keyword);
           expand_bool_pixel = __pixel_keyword;
@@ -178,34 +188,55 @@ rs6000_macro_to_expand (cpp_reader *pfil
           enum rid rid_code = (enum rid)(ident->rid_code);
           if (ident->type == NT_MACRO)
             {
-              (void)cpp_get_token (pfile);
-              tok = cpp_peek_token (pfile, 0);
+              do
+                (void) cpp_get_token (pfile);
+              while (--idx > 0);
+              do
+                tok = cpp_peek_token (pfile, idx++);
+              while (tok->type == CPP_PADDING);
               ident = altivec_categorize_keyword (tok);
-              if (ident)
+              if (ident == C_CPP_HASHNODE (__pixel_keyword))
+                {
+                  expand_this = C_CPP_HASHNODE (__vector_keyword);
+                  expand_bool_pixel = __pixel_keyword;
+                  rid_code = RID_MAX;
+                }
+              else if (ident == C_CPP_HASHNODE (__bool_keyword))
+                {
+                  expand_this = C_CPP_HASHNODE (__vector_keyword);
+                  expand_bool_pixel = __bool_keyword;
+                  rid_code = RID_MAX;
+                }
+              else if (ident)
                 rid_code = (enum rid)(ident->rid_code);
             }
 
           if (rid_code == RID_UNSIGNED || rid_code == RID_LONG
               || rid_code == RID_SHORT || rid_code == RID_SIGNED
               || rid_code == RID_INT || rid_code == RID_CHAR
-              || rid_code == RID_FLOAT)
+              || rid_code == RID_FLOAT
+              || (rid_code == RID_DOUBLE && TARGET_VSX))
             {
               expand_this = C_CPP_HASHNODE (__vector_keyword);
               /* If the next keyword is bool or pixel, it will need
                  to be expanded as well.  */
-              tok = cpp_peek_token (pfile, 1);
+              do
+                tok = cpp_peek_token (pfile, idx++);
+              while (tok->type == CPP_PADDING);
               ident = altivec_categorize_keyword (tok);
 
-              if (ident == C_CPP_HASHNODE (__pixel_keyword))
+              if (ident == C_CPP_HASHNODE (__pixel_keyword))
                 expand_bool_pixel = __pixel_keyword;
               else if (ident == C_CPP_HASHNODE (__bool_keyword))
                 expand_bool_pixel = __bool_keyword;
               else
                 {
                   /* Try two tokens down, too.  */
-                  tok = cpp_peek_token (pfile, 2);
+                  do
+                    tok = cpp_peek_token (pfile, idx++);
+                  while (tok->type == CPP_PADDING);
                   ident = altivec_categorize_keyword (tok);
-                  if (ident == C_CPP_HASHNODE (__pixel_keyword))
+                  if (ident == C_CPP_HASHNODE (__pixel_keyword))
                     expand_bool_pixel = __pixel_keyword;
                   else if (ident == C_CPP_HASHNODE (__bool_keyword))
                     expand_bool_pixel = __bool_keyword;
@@ -254,6 +285,8 @@ rs6000_cpu_cpp_builtins (cpp_reader *pfi
     builtin_define ("_ARCH_PWR6X");
   if (! TARGET_POWER && ! TARGET_POWER2 && ! TARGET_POWERPC)
     builtin_define ("_ARCH_COM");
+  if (TARGET_POPCNTD)
+    builtin_define ("_ARCH_PWR7");
   if (TARGET_ALTIVEC)
     {
       builtin_define ("__ALTIVEC__");
@@ -272,6 +305,7 @@ rs6000_cpu_cpp_builtins (cpp_reader *pfi
       builtin_define ("vector=vector");
       builtin_define ("pixel=pixel");
       builtin_define ("bool=bool");
+      builtin_define ("_Bool=_Bool");
       init_vector_keywords ();
 
       /* Enable context-sensitive macros.  */
@@ -294,6 +328,43 @@ rs6000_cpu_cpp_builtins (cpp_reader *pfi
   /* Used by libstdc++.  */
   if (TARGET_NO_LWSYNC)
     builtin_define ("__NO_LWSYNC__");
+  if (TARGET_VSX)
+    {
+      builtin_define ("__VSX__");
+
+      /* For the VSX builtin functions identical to Altivec functions, just map
+         the altivec builtin into the vsx version (the altivec functions
+         generate VSX code if -mvsx).  */
+      builtin_define ("__builtin_vsx_xxland=__builtin_vec_and");
+      builtin_define ("__builtin_vsx_xxlandc=__builtin_vec_andc");
+      builtin_define ("__builtin_vsx_xxlnor=__builtin_vec_nor");
+      builtin_define ("__builtin_vsx_xxlor=__builtin_vec_or");
+      builtin_define ("__builtin_vsx_xxlxor=__builtin_vec_xor");
+      builtin_define ("__builtin_vsx_xxsel=__builtin_vec_sel");
+      builtin_define ("__builtin_vsx_vperm=__builtin_vec_perm");
+
+      /* Also map the a and m versions of the multiply/add instructions to the
+         builtin for people blindly going off the instruction manual.  */
+      builtin_define ("__builtin_vsx_xvmaddadp=__builtin_vsx_xvmadddp");
+      builtin_define ("__builtin_vsx_xvmaddmdp=__builtin_vsx_xvmadddp");
+      builtin_define ("__builtin_vsx_xvmaddasp=__builtin_vsx_xvmaddsp");
+      builtin_define ("__builtin_vsx_xvmaddmsp=__builtin_vsx_xvmaddsp");
+      builtin_define ("__builtin_vsx_xvmsubadp=__builtin_vsx_xvmsubdp");
+      builtin_define ("__builtin_vsx_xvmsubmdp=__builtin_vsx_xvmsubdp");
+      builtin_define ("__builtin_vsx_xvmsubasp=__builtin_vsx_xvmsubsp");
+      builtin_define ("__builtin_vsx_xvmsubmsp=__builtin_vsx_xvmsubsp");
+      builtin_define ("__builtin_vsx_xvnmaddadp=__builtin_vsx_xvnmadddp");
+      builtin_define ("__builtin_vsx_xvnmaddmdp=__builtin_vsx_xvnmadddp");
+      builtin_define ("__builtin_vsx_xvnmaddasp=__builtin_vsx_xvnmaddsp");
+      builtin_define ("__builtin_vsx_xvnmaddmsp=__builtin_vsx_xvnmaddsp");
+      builtin_define ("__builtin_vsx_xvnmsubadp=__builtin_vsx_xvnmsubdp");
+      builtin_define ("__builtin_vsx_xvnmsubmdp=__builtin_vsx_xvnmsubdp");
+      builtin_define ("__builtin_vsx_xvnmsubasp=__builtin_vsx_xvnmsubsp");
+      builtin_define ("__builtin_vsx_xvnmsubmsp=__builtin_vsx_xvnmsubsp");
+    }
+
+  /* Tell users they can use __builtin_bswap{16,64}.  */
+  builtin_define ("__HAVE_BSWAP__");
 
   /* May be overridden by target configuration.  */
   RS6000_CPU_CPP_ENDIAN_BUILTINS();
@@ -323,6 +394,26 @@ rs6000_cpu_cpp_builtins (cpp_reader *pfi
   /* Let the compiled code know if 'f' class registers will not be available.  */
   if (TARGET_SOFT_FLOAT || !TARGET_FPRS)
     builtin_define ("__NO_FPRS__");
+
+  /* Generate defines for Xilinx FPU.  */
+  if (rs6000_xilinx_fpu)
+    {
+      builtin_define ("_XFPU");
+      if (rs6000_single_float && ! rs6000_double_float)
+        {
+          if (rs6000_simple_fpu)
+            builtin_define ("_XFPU_SP_LITE");
+          else
+            builtin_define ("_XFPU_SP_FULL");
+        }
+      if (rs6000_double_float)
+        {
+          if (rs6000_simple_fpu)
+            builtin_define ("_XFPU_DP_LITE");
+          else
+            builtin_define ("_XFPU_DP_FULL");
+        }
+    }
 }
 
 
@@ -337,7 +428,7 @@ struct altivec_builtin_types
 };
 
 const struct altivec_builtin_types altivec_overloaded_builtins[] = {
-  /* Unary AltiVec builtins.  */
+  /* Unary AltiVec/VSX builtins.  */
   { ALTIVEC_BUILTIN_VEC_ABS, ALTIVEC_BUILTIN_ABS_V16QI,
     RS6000_BTI_V16QI, RS6000_BTI_V16QI, 0, 0 },
   { ALTIVEC_BUILTIN_VEC_ABS, ALTIVEC_BUILTIN_ABS_V8HI,
@@ -346,6 +437,8 @@ const struct altivec_builtin_types altiv
     RS6000_BTI_V4SI, RS6000_BTI_V4SI, 0, 0 },
   { ALTIVEC_BUILTIN_VEC_ABS, ALTIVEC_BUILTIN_ABS_V4SF,
     RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0, 0 },
+  { ALTIVEC_BUILTIN_VEC_ABS, VSX_BUILTIN_XVABSDP,
+    RS6000_BTI_V2DF, RS6000_BTI_V2DF, 0, 0 },
   { ALTIVEC_BUILTIN_VEC_ABSS, ALTIVEC_BUILTIN_ABSS_V16QI,
     RS6000_BTI_V16QI, RS6000_BTI_V16QI, 0, 0 },
   { ALTIVEC_BUILTIN_VEC_ABSS, ALTIVEC_BUILTIN_ABSS_V8HI,
@@ -354,8 +447,12 @@ const struct altivec_builtin_types altiv
     RS6000_BTI_V4SI, RS6000_BTI_V4SI, 0, 0 },
   { ALTIVEC_BUILTIN_VEC_CEIL, ALTIVEC_BUILTIN_VRFIP,
     RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0, 0 },
+  { ALTIVEC_BUILTIN_VEC_CEIL, VSX_BUILTIN_XVRDPIP,
+    RS6000_BTI_V2DF, RS6000_BTI_V2DF, 0, 0 },
   { ALTIVEC_BUILTIN_VEC_EXPTE, ALTIVEC_BUILTIN_VEXPTEFP,
     RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0, 0 },
+  { ALTIVEC_BUILTIN_VEC_FLOOR, VSX_BUILTIN_XVRDPIM,
+    RS6000_BTI_V2DF, RS6000_BTI_V2DF, 0, 0 },
   { ALTIVEC_BUILTIN_VEC_FLOOR, ALTIVEC_BUILTIN_VRFIM,
     RS6000_BTI_V4SF,
RS6000_BTI_V4SF, 0, 0 }, { ALTIVEC_BUILTIN_VEC_LOGE, ALTIVEC_BUILTIN_VLOGEFP, @@ -388,6 +485,8 @@ const struct altivec_builtin_types altiv RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0, 0 }, { ALTIVEC_BUILTIN_VEC_TRUNC, ALTIVEC_BUILTIN_VRFIZ, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0, 0 }, + { ALTIVEC_BUILTIN_VEC_TRUNC, VSX_BUILTIN_XVRDPIZ, + RS6000_BTI_V2DF, RS6000_BTI_V2DF, 0, 0 }, { ALTIVEC_BUILTIN_VEC_UNPACKH, ALTIVEC_BUILTIN_VUPKHSB, RS6000_BTI_V8HI, RS6000_BTI_V16QI, 0, 0 }, { ALTIVEC_BUILTIN_VEC_UNPACKH, ALTIVEC_BUILTIN_VUPKHSB, @@ -433,7 +532,7 @@ const struct altivec_builtin_types altiv { ALTIVEC_BUILTIN_VEC_VUPKLSB, ALTIVEC_BUILTIN_VUPKLSB, RS6000_BTI_bool_V8HI, RS6000_BTI_bool_V16QI, 0, 0 }, - /* Binary AltiVec builtins. */ + /* Binary AltiVec/VSX builtins. */ { ALTIVEC_BUILTIN_VEC_ADD, ALTIVEC_BUILTIN_VADDUBM, RS6000_BTI_V16QI, RS6000_BTI_bool_V16QI, RS6000_BTI_V16QI, 0 }, { ALTIVEC_BUILTIN_VEC_ADD, ALTIVEC_BUILTIN_VADDUBM, @@ -472,6 +571,8 @@ const struct altivec_builtin_types altiv RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, 0 }, { ALTIVEC_BUILTIN_VEC_ADD, ALTIVEC_BUILTIN_VADDFP, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0 }, + { ALTIVEC_BUILTIN_VEC_ADD, VSX_BUILTIN_XVADDDP, + RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, 0 }, { ALTIVEC_BUILTIN_VEC_VADDFP, ALTIVEC_BUILTIN_VADDFP, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0 }, { ALTIVEC_BUILTIN_VEC_VADDUWM, ALTIVEC_BUILTIN_VADDUWM, @@ -615,6 +716,12 @@ const struct altivec_builtin_types altiv { ALTIVEC_BUILTIN_VEC_AND, ALTIVEC_BUILTIN_VAND, RS6000_BTI_V4SF, RS6000_BTI_bool_V4SI, RS6000_BTI_V4SF, 0 }, { ALTIVEC_BUILTIN_VEC_AND, ALTIVEC_BUILTIN_VAND, + RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, 0 }, + { ALTIVEC_BUILTIN_VEC_AND, ALTIVEC_BUILTIN_VAND, + RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_bool_V2DI, 0 }, + { ALTIVEC_BUILTIN_VEC_AND, ALTIVEC_BUILTIN_VAND, + RS6000_BTI_V2DF, RS6000_BTI_bool_V2DI, RS6000_BTI_V2DF, 0 }, + { ALTIVEC_BUILTIN_VEC_AND, 
ALTIVEC_BUILTIN_VAND, RS6000_BTI_bool_V4SI, RS6000_BTI_bool_V4SI, RS6000_BTI_bool_V4SI, 0 }, { ALTIVEC_BUILTIN_VEC_AND, ALTIVEC_BUILTIN_VAND, RS6000_BTI_V4SI, RS6000_BTI_bool_V4SI, RS6000_BTI_V4SI, 0 }, @@ -663,6 +770,12 @@ const struct altivec_builtin_types altiv { ALTIVEC_BUILTIN_VEC_ANDC, ALTIVEC_BUILTIN_VANDC, RS6000_BTI_V4SF, RS6000_BTI_bool_V4SI, RS6000_BTI_V4SF, 0 }, { ALTIVEC_BUILTIN_VEC_ANDC, ALTIVEC_BUILTIN_VANDC, + RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, 0 }, + { ALTIVEC_BUILTIN_VEC_ANDC, ALTIVEC_BUILTIN_VANDC, + RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_bool_V2DI, 0 }, + { ALTIVEC_BUILTIN_VEC_ANDC, ALTIVEC_BUILTIN_VANDC, + RS6000_BTI_V2DF, RS6000_BTI_bool_V2DI, RS6000_BTI_V2DF, 0 }, + { ALTIVEC_BUILTIN_VEC_ANDC, ALTIVEC_BUILTIN_VANDC, RS6000_BTI_bool_V4SI, RS6000_BTI_bool_V4SI, RS6000_BTI_bool_V4SI, 0 }, { ALTIVEC_BUILTIN_VEC_ANDC, ALTIVEC_BUILTIN_VANDC, RS6000_BTI_V4SI, RS6000_BTI_bool_V4SI, RS6000_BTI_V4SI, 0 }, @@ -744,6 +857,8 @@ const struct altivec_builtin_types altiv RS6000_BTI_bool_V4SI, RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, 0 }, { ALTIVEC_BUILTIN_VEC_CMPEQ, ALTIVEC_BUILTIN_VCMPEQFP, RS6000_BTI_bool_V4SI, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0 }, + { ALTIVEC_BUILTIN_VEC_CMPEQ, VSX_BUILTIN_XVCMPEQDP, + RS6000_BTI_bool_V2DI, RS6000_BTI_V2DF, RS6000_BTI_V2DF, 0 }, { ALTIVEC_BUILTIN_VEC_VCMPEQFP, ALTIVEC_BUILTIN_VCMPEQFP, RS6000_BTI_bool_V4SI, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0 }, @@ -764,6 +879,8 @@ const struct altivec_builtin_types altiv { ALTIVEC_BUILTIN_VEC_CMPGE, ALTIVEC_BUILTIN_VCMPGEFP, RS6000_BTI_bool_V4SI, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0 }, + { ALTIVEC_BUILTIN_VEC_CMPGE, VSX_BUILTIN_XVCMPGEDP, + RS6000_BTI_bool_V2DI, RS6000_BTI_V2DF, RS6000_BTI_V2DF, 0 }, { ALTIVEC_BUILTIN_VEC_CMPGT, ALTIVEC_BUILTIN_VCMPGTUB, RS6000_BTI_bool_V16QI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, 0 }, { ALTIVEC_BUILTIN_VEC_CMPGT, ALTIVEC_BUILTIN_VCMPGTSB, @@ -778,6 +895,8 @@ const struct altivec_builtin_types altiv 
RS6000_BTI_bool_V4SI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, 0 }, { ALTIVEC_BUILTIN_VEC_CMPGT, ALTIVEC_BUILTIN_VCMPGTFP, RS6000_BTI_bool_V4SI, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0 }, + { ALTIVEC_BUILTIN_VEC_CMPGT, VSX_BUILTIN_XVCMPGTDP, + RS6000_BTI_bool_V2DI, RS6000_BTI_V2DF, RS6000_BTI_V2DF, 0 }, { ALTIVEC_BUILTIN_VEC_VCMPGTFP, ALTIVEC_BUILTIN_VCMPGTFP, RS6000_BTI_bool_V4SI, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0 }, { ALTIVEC_BUILTIN_VEC_VCMPGTSW, ALTIVEC_BUILTIN_VCMPGTSW, @@ -806,6 +925,8 @@ const struct altivec_builtin_types altiv RS6000_BTI_bool_V16QI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, 0 }, { ALTIVEC_BUILTIN_VEC_CMPLE, ALTIVEC_BUILTIN_VCMPGEFP, RS6000_BTI_bool_V4SI, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0 }, + { ALTIVEC_BUILTIN_VEC_CMPLE, VSX_BUILTIN_XVCMPGEDP, + RS6000_BTI_bool_V2DI, RS6000_BTI_V2DF, RS6000_BTI_V2DF, 0 }, { ALTIVEC_BUILTIN_VEC_CMPLT, ALTIVEC_BUILTIN_VCMPGTUB, RS6000_BTI_bool_V16QI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, 0 }, { ALTIVEC_BUILTIN_VEC_CMPLT, ALTIVEC_BUILTIN_VCMPGTSB, @@ -820,6 +941,12 @@ const struct altivec_builtin_types altiv RS6000_BTI_bool_V4SI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, 0 }, { ALTIVEC_BUILTIN_VEC_CMPLT, ALTIVEC_BUILTIN_VCMPGTFP, RS6000_BTI_bool_V4SI, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0 }, + { ALTIVEC_BUILTIN_VEC_CMPLT, VSX_BUILTIN_XVCMPGTDP, + RS6000_BTI_bool_V2DI, RS6000_BTI_V2DF, RS6000_BTI_V2DF, 0 }, + { ALTIVEC_BUILTIN_VEC_COPYSIGN, VSX_BUILTIN_CPSGNDP, + RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, 0 }, + { ALTIVEC_BUILTIN_VEC_COPYSIGN, ALTIVEC_BUILTIN_COPYSIGN_V4SF, + RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0 }, { ALTIVEC_BUILTIN_VEC_CTF, ALTIVEC_BUILTIN_VCFUX, RS6000_BTI_V4SF, RS6000_BTI_unsigned_V4SI, RS6000_BTI_INTSI, 0 }, { ALTIVEC_BUILTIN_VEC_CTF, ALTIVEC_BUILTIN_VCFSX, @@ -832,6 +959,10 @@ const struct altivec_builtin_types altiv RS6000_BTI_V4SI, RS6000_BTI_V4SF, RS6000_BTI_INTSI, 0 }, { ALTIVEC_BUILTIN_VEC_CTU, ALTIVEC_BUILTIN_VCTUXS, RS6000_BTI_unsigned_V4SI, 
RS6000_BTI_V4SF, RS6000_BTI_INTSI, 0 }, + { VSX_BUILTIN_VEC_DIV, VSX_BUILTIN_XVDIVSP, + RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0 }, + { VSX_BUILTIN_VEC_DIV, VSX_BUILTIN_XVDIVDP, + RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, 0 }, { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX, RS6000_BTI_V4SF, RS6000_BTI_INTSI, ~RS6000_BTI_V4SF, 0 }, { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX, @@ -1166,6 +1297,8 @@ const struct altivec_builtin_types altiv RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, 0 }, { ALTIVEC_BUILTIN_VEC_MAX, ALTIVEC_BUILTIN_VMAXFP, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0 }, + { ALTIVEC_BUILTIN_VEC_MAX, VSX_BUILTIN_XVMAXDP, + RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, 0 }, { ALTIVEC_BUILTIN_VEC_VMAXFP, ALTIVEC_BUILTIN_VMAXFP, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0 }, { ALTIVEC_BUILTIN_VEC_VMAXSW, ALTIVEC_BUILTIN_VMAXSW, @@ -1342,6 +1475,8 @@ const struct altivec_builtin_types altiv RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, 0 }, { ALTIVEC_BUILTIN_VEC_MIN, ALTIVEC_BUILTIN_VMINFP, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0 }, + { ALTIVEC_BUILTIN_VEC_MIN, VSX_BUILTIN_XVMINDP, + RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, 0 }, { ALTIVEC_BUILTIN_VEC_VMINFP, ALTIVEC_BUILTIN_VMINFP, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0 }, { ALTIVEC_BUILTIN_VEC_VMINSW, ALTIVEC_BUILTIN_VMINSW, @@ -1392,6 +1527,10 @@ const struct altivec_builtin_types altiv RS6000_BTI_unsigned_V16QI, RS6000_BTI_bool_V16QI, RS6000_BTI_unsigned_V16QI, 0 }, { ALTIVEC_BUILTIN_VEC_VMINUB, ALTIVEC_BUILTIN_VMINUB, RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_bool_V16QI, 0 }, + { VSX_BUILTIN_VEC_MUL, VSX_BUILTIN_XVMULSP, + RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0 }, + { VSX_BUILTIN_VEC_MUL, VSX_BUILTIN_XVMULDP, + RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, 0 }, { ALTIVEC_BUILTIN_VEC_MULE, ALTIVEC_BUILTIN_VMULEUB, RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V16QI, 
RS6000_BTI_unsigned_V16QI, 0 }, { ALTIVEC_BUILTIN_VEC_MULE, ALTIVEC_BUILTIN_VMULESB, @@ -1424,9 +1563,15 @@ const struct altivec_builtin_types altiv RS6000_BTI_V8HI, RS6000_BTI_V16QI, RS6000_BTI_V16QI, 0 }, { ALTIVEC_BUILTIN_VEC_VMULOUB, ALTIVEC_BUILTIN_VMULOUB, RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, 0 }, + { ALTIVEC_BUILTIN_VEC_NEARBYINT, VSX_BUILTIN_XVRDPI, + RS6000_BTI_V2DF, RS6000_BTI_V2DF, 0, 0 }, + { ALTIVEC_BUILTIN_VEC_NEARBYINT, VSX_BUILTIN_XVRSPI, + RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0, 0 }, { ALTIVEC_BUILTIN_VEC_NOR, ALTIVEC_BUILTIN_VNOR, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0 }, { ALTIVEC_BUILTIN_VEC_NOR, ALTIVEC_BUILTIN_VNOR, + RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, 0 }, + { ALTIVEC_BUILTIN_VEC_NOR, ALTIVEC_BUILTIN_VNOR, RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, 0 }, { ALTIVEC_BUILTIN_VEC_NOR, ALTIVEC_BUILTIN_VNOR, RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, 0 }, @@ -1451,6 +1596,12 @@ const struct altivec_builtin_types altiv { ALTIVEC_BUILTIN_VEC_OR, ALTIVEC_BUILTIN_VOR, RS6000_BTI_V4SF, RS6000_BTI_bool_V4SI, RS6000_BTI_V4SF, 0 }, { ALTIVEC_BUILTIN_VEC_OR, ALTIVEC_BUILTIN_VOR, + RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, 0 }, + { ALTIVEC_BUILTIN_VEC_OR, ALTIVEC_BUILTIN_VOR, + RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_bool_V2DI, 0 }, + { ALTIVEC_BUILTIN_VEC_OR, ALTIVEC_BUILTIN_VOR, + RS6000_BTI_V2DF, RS6000_BTI_bool_V2DI, RS6000_BTI_V2DF, 0 }, + { ALTIVEC_BUILTIN_VEC_OR, ALTIVEC_BUILTIN_VOR, RS6000_BTI_bool_V4SI, RS6000_BTI_bool_V4SI, RS6000_BTI_bool_V4SI, 0 }, { ALTIVEC_BUILTIN_VEC_OR, ALTIVEC_BUILTIN_VOR, RS6000_BTI_V4SI, RS6000_BTI_bool_V4SI, RS6000_BTI_V4SI, 0 }, @@ -1546,6 +1697,10 @@ const struct altivec_builtin_types altiv RS6000_BTI_unsigned_V8HI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, 0 }, { ALTIVEC_BUILTIN_VEC_VPKSHUS, ALTIVEC_BUILTIN_VPKSHUS, RS6000_BTI_unsigned_V16QI, RS6000_BTI_V8HI, RS6000_BTI_V8HI, 0 }, + { 
ALTIVEC_BUILTIN_VEC_RINT, VSX_BUILTIN_XVRDPIC, + RS6000_BTI_V2DF, RS6000_BTI_V2DF, 0, 0 }, + { ALTIVEC_BUILTIN_VEC_RINT, VSX_BUILTIN_XVRSPIC, + RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0, 0 }, { ALTIVEC_BUILTIN_VEC_RL, ALTIVEC_BUILTIN_VRLB, RS6000_BTI_V16QI, RS6000_BTI_V16QI, RS6000_BTI_unsigned_V16QI, 0 }, { ALTIVEC_BUILTIN_VEC_RL, ALTIVEC_BUILTIN_VRLB, @@ -1582,6 +1737,10 @@ const struct altivec_builtin_types altiv RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_unsigned_V4SI, 0 }, { ALTIVEC_BUILTIN_VEC_SL, ALTIVEC_BUILTIN_VSLW, RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, 0 }, + { ALTIVEC_BUILTIN_VEC_SQRT, VSX_BUILTIN_XVSQRTDP, + RS6000_BTI_V2DF, RS6000_BTI_V2DF, 0, 0 }, + { ALTIVEC_BUILTIN_VEC_SQRT, VSX_BUILTIN_XVSQRTSP, + RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0, 0 }, { ALTIVEC_BUILTIN_VEC_VSLW, ALTIVEC_BUILTIN_VSLW, RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_unsigned_V4SI, 0 }, { ALTIVEC_BUILTIN_VEC_VSLW, ALTIVEC_BUILTIN_VSLW, @@ -1908,6 +2067,8 @@ const struct altivec_builtin_types altiv RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, 0 }, { ALTIVEC_BUILTIN_VEC_SUB, ALTIVEC_BUILTIN_VSUBFP, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0 }, + { ALTIVEC_BUILTIN_VEC_SUB, VSX_BUILTIN_XVSUBDP, + RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, 0 }, { ALTIVEC_BUILTIN_VEC_VSUBFP, ALTIVEC_BUILTIN_VSUBFP, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0 }, { ALTIVEC_BUILTIN_VEC_VSUBUWM, ALTIVEC_BUILTIN_VSUBUWM, @@ -2067,6 +2228,12 @@ const struct altivec_builtin_types altiv { ALTIVEC_BUILTIN_VEC_XOR, ALTIVEC_BUILTIN_VXOR, RS6000_BTI_V4SF, RS6000_BTI_bool_V4SI, RS6000_BTI_V4SF, 0 }, { ALTIVEC_BUILTIN_VEC_XOR, ALTIVEC_BUILTIN_VXOR, + RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, 0 }, + { ALTIVEC_BUILTIN_VEC_XOR, ALTIVEC_BUILTIN_VXOR, + RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_bool_V2DI, 0 }, + { ALTIVEC_BUILTIN_VEC_XOR, ALTIVEC_BUILTIN_VXOR, + RS6000_BTI_V2DF, RS6000_BTI_bool_V2DI, RS6000_BTI_V2DF, 0 
}, + { ALTIVEC_BUILTIN_VEC_XOR, ALTIVEC_BUILTIN_VXOR, RS6000_BTI_bool_V4SI, RS6000_BTI_bool_V4SI, RS6000_BTI_bool_V4SI, 0 }, { ALTIVEC_BUILTIN_VEC_XOR, ALTIVEC_BUILTIN_VXOR, RS6000_BTI_V4SI, RS6000_BTI_bool_V4SI, RS6000_BTI_V4SI, 0 }, @@ -2109,7 +2276,7 @@ const struct altivec_builtin_types altiv { ALTIVEC_BUILTIN_VEC_XOR, ALTIVEC_BUILTIN_VXOR, RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, 0 }, - /* Ternary AltiVec builtins. */ + /* Ternary AltiVec/VSX builtins. */ { ALTIVEC_BUILTIN_VEC_DST, ALTIVEC_BUILTIN_DST, RS6000_BTI_void, ~RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI, RS6000_BTI_INTSI }, { ALTIVEC_BUILTIN_VEC_DST, ALTIVEC_BUILTIN_DST, @@ -2272,6 +2439,8 @@ const struct altivec_builtin_types altiv RS6000_BTI_void, ~RS6000_BTI_float, RS6000_BTI_INTSI, RS6000_BTI_INTSI }, { ALTIVEC_BUILTIN_VEC_MADD, ALTIVEC_BUILTIN_VMADDFP, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF }, + { ALTIVEC_BUILTIN_VEC_MADD, VSX_BUILTIN_XVMADDDP, + RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF }, { ALTIVEC_BUILTIN_VEC_MADDS, ALTIVEC_BUILTIN_VMHADDSHS, RS6000_BTI_V8HI, RS6000_BTI_V8HI, RS6000_BTI_V8HI, RS6000_BTI_V8HI }, { ALTIVEC_BUILTIN_VEC_MLADD, ALTIVEC_BUILTIN_VMLADDUHM, @@ -2284,6 +2453,10 @@ const struct altivec_builtin_types altiv RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V8HI }, { ALTIVEC_BUILTIN_VEC_MRADDS, ALTIVEC_BUILTIN_VMHRADDSHS, RS6000_BTI_V8HI, RS6000_BTI_V8HI, RS6000_BTI_V8HI, RS6000_BTI_V8HI }, + { VSX_BUILTIN_VEC_MSUB, VSX_BUILTIN_XVMSUBSP, + RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF }, + { VSX_BUILTIN_VEC_MSUB, VSX_BUILTIN_XVMSUBDP, + RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF }, { ALTIVEC_BUILTIN_VEC_MSUM, ALTIVEC_BUILTIN_VMSUMUBM, RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V4SI }, { ALTIVEC_BUILTIN_VEC_MSUM, 
ALTIVEC_BUILTIN_VMSUMMBM, @@ -2308,8 +2481,18 @@ const struct altivec_builtin_types altiv RS6000_BTI_V4SI, RS6000_BTI_V8HI, RS6000_BTI_V8HI, RS6000_BTI_V4SI }, { ALTIVEC_BUILTIN_VEC_VMSUMUHS, ALTIVEC_BUILTIN_VMSUMUHS, RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V4SI }, + { VSX_BUILTIN_VEC_NMADD, VSX_BUILTIN_XVNMADDSP, + RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF }, + { VSX_BUILTIN_VEC_NMADD, VSX_BUILTIN_XVNMADDDP, + RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF }, { ALTIVEC_BUILTIN_VEC_NMSUB, ALTIVEC_BUILTIN_VNMSUBFP, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF }, + { ALTIVEC_BUILTIN_VEC_NMSUB, VSX_BUILTIN_XVNMSUBDP, + RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF }, + { ALTIVEC_BUILTIN_VEC_PERM, ALTIVEC_BUILTIN_VPERM_2DF, + RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_unsigned_V16QI }, + { ALTIVEC_BUILTIN_VEC_PERM, ALTIVEC_BUILTIN_VPERM_2DI, + RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_unsigned_V16QI }, { ALTIVEC_BUILTIN_VEC_PERM, ALTIVEC_BUILTIN_VPERM_4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_unsigned_V16QI }, { ALTIVEC_BUILTIN_VEC_PERM, ALTIVEC_BUILTIN_VPERM_4SI, @@ -2336,11 +2519,29 @@ const struct altivec_builtin_types altiv RS6000_BTI_bool_V16QI, RS6000_BTI_bool_V16QI, RS6000_BTI_bool_V16QI, RS6000_BTI_unsigned_V16QI }, { ALTIVEC_BUILTIN_VEC_PERM, ALTIVEC_BUILTIN_VPERM_16QI, RS6000_BTI_bool_V16QI, RS6000_BTI_bool_V16QI, RS6000_BTI_bool_V16QI, RS6000_BTI_bool_V16QI }, + { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DF, + RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_bool_V2DI }, + { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DF, + RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_unsigned_V2DI }, + { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DF, + RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DI }, + { 
ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DF, + RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF }, + { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DI, + RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_bool_V2DI }, + { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DI, + RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_unsigned_V2DI }, + { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DI, + RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI }, { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_bool_V4SI }, { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_unsigned_V4SI }, { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_4SI, + RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF }, + { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_4SI, + RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SI }, + { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_4SI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_bool_V4SI }, { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_4SI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_unsigned_V4SI }, @@ -2708,6 +2909,54 @@ const struct altivec_builtin_types altiv RS6000_BTI_void, RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_unsigned_V16QI }, { ALTIVEC_BUILTIN_VEC_STVRXL, ALTIVEC_BUILTIN_STVRXL, RS6000_BTI_void, RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_UINTQI }, + { VSX_BUILTIN_VEC_XXSLDWI, VSX_BUILTIN_XXSLDWI_16QI, + RS6000_BTI_V16QI, RS6000_BTI_V16QI, RS6000_BTI_V16QI, RS6000_BTI_NOT_OPAQUE }, + { VSX_BUILTIN_VEC_XXSLDWI, VSX_BUILTIN_XXSLDWI_16QI, + RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, + RS6000_BTI_NOT_OPAQUE }, + { VSX_BUILTIN_VEC_XXSLDWI, VSX_BUILTIN_XXSLDWI_8HI, + RS6000_BTI_V8HI, RS6000_BTI_V8HI, RS6000_BTI_V8HI, 
RS6000_BTI_NOT_OPAQUE }, + { VSX_BUILTIN_VEC_XXSLDWI, VSX_BUILTIN_XXSLDWI_8HI, + RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V8HI, + RS6000_BTI_NOT_OPAQUE }, + { VSX_BUILTIN_VEC_XXSLDWI, VSX_BUILTIN_XXSLDWI_4SI, + RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_NOT_OPAQUE }, + { VSX_BUILTIN_VEC_XXSLDWI, VSX_BUILTIN_XXSLDWI_4SI, + RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, + RS6000_BTI_NOT_OPAQUE }, + { VSX_BUILTIN_VEC_XXSLDWI, VSX_BUILTIN_XXSLDWI_2DI, + RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_NOT_OPAQUE }, + { VSX_BUILTIN_VEC_XXSLDWI, VSX_BUILTIN_XXSLDWI_2DI, + RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, + RS6000_BTI_NOT_OPAQUE }, + { VSX_BUILTIN_VEC_XXSLDWI, VSX_BUILTIN_XXSLDWI_4SF, + RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_NOT_OPAQUE }, + { VSX_BUILTIN_VEC_XXSLDWI, VSX_BUILTIN_XXSLDWI_2DF, + RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_NOT_OPAQUE }, + { VSX_BUILTIN_VEC_XXPERMDI, VSX_BUILTIN_XXPERMDI_2DF, + RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_NOT_OPAQUE }, + { VSX_BUILTIN_VEC_XXPERMDI, VSX_BUILTIN_XXPERMDI_2DI, + RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_NOT_OPAQUE }, + { VSX_BUILTIN_VEC_XXPERMDI, VSX_BUILTIN_XXPERMDI_2DI, + RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, + RS6000_BTI_NOT_OPAQUE }, + { VSX_BUILTIN_VEC_XXPERMDI, VSX_BUILTIN_XXPERMDI_4SF, + RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_NOT_OPAQUE }, + { VSX_BUILTIN_VEC_XXPERMDI, VSX_BUILTIN_XXPERMDI_4SI, + RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_NOT_OPAQUE }, + { VSX_BUILTIN_VEC_XXPERMDI, VSX_BUILTIN_XXPERMDI_4SI, + RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, + RS6000_BTI_NOT_OPAQUE }, + { VSX_BUILTIN_VEC_XXPERMDI, VSX_BUILTIN_XXPERMDI_8HI, + RS6000_BTI_V8HI, RS6000_BTI_V8HI, 
RS6000_BTI_V8HI, RS6000_BTI_NOT_OPAQUE }, + { VSX_BUILTIN_VEC_XXPERMDI, VSX_BUILTIN_XXPERMDI_8HI, + RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V8HI, + RS6000_BTI_NOT_OPAQUE }, + { VSX_BUILTIN_VEC_XXPERMDI, VSX_BUILTIN_XXPERMDI_16QI, + RS6000_BTI_V16QI, RS6000_BTI_V16QI, RS6000_BTI_V16QI, RS6000_BTI_NOT_OPAQUE }, + { VSX_BUILTIN_VEC_XXPERMDI, VSX_BUILTIN_XXPERMDI_16QI, + RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, + RS6000_BTI_NOT_OPAQUE }, /* Predicates. */ { ALTIVEC_BUILTIN_VCMPGT_P, ALTIVEC_BUILTIN_VCMPGTUB_P, @@ -2748,6 +2997,8 @@ const struct altivec_builtin_types altiv RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_V4SI, RS6000_BTI_V4SI }, { ALTIVEC_BUILTIN_VCMPGT_P, ALTIVEC_BUILTIN_VCMPGTFP_P, RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_V4SF, RS6000_BTI_V4SF }, + { ALTIVEC_BUILTIN_VCMPGT_P, VSX_BUILTIN_XVCMPGTDP_P, + RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_V2DF, RS6000_BTI_V2DF }, { ALTIVEC_BUILTIN_VCMPEQ_P, ALTIVEC_BUILTIN_VCMPEQUB_P, @@ -2796,6 +3047,8 @@ const struct altivec_builtin_types altiv RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_bool_V4SI, RS6000_BTI_bool_V4SI }, { ALTIVEC_BUILTIN_VCMPEQ_P, ALTIVEC_BUILTIN_VCMPEQFP_P, RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_V4SF, RS6000_BTI_V4SF }, + { ALTIVEC_BUILTIN_VCMPEQ_P, VSX_BUILTIN_XVCMPEQDP_P, + RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_V2DF, RS6000_BTI_V2DF }, /* cmpge is the same as cmpgt for all cases except floating point. 
@@ -2839,8 +3092,10 @@ const struct altivec_builtin_types altiv RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_V4SI, RS6000_BTI_V4SI }, { ALTIVEC_BUILTIN_VCMPGE_P, ALTIVEC_BUILTIN_VCMPGEFP_P, RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_V4SF, RS6000_BTI_V4SF }, + { ALTIVEC_BUILTIN_VCMPGE_P, VSX_BUILTIN_XVCMPGEDP_P, + RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_V2DF, RS6000_BTI_V2DF }, - { 0, 0, 0, 0, 0, 0 } + { (enum rs6000_builtins) 0, (enum rs6000_builtins) 0, 0, 0, 0, 0 } }; @@ -2868,6 +3123,8 @@ rs6000_builtin_type_compatible (tree t, { tree builtin_type; builtin_type = rs6000_builtin_type (id); + if (t == error_mark_node) + return false; if (INTEGRAL_TYPE_P (t) && INTEGRAL_TYPE_P (builtin_type)) return true; else @@ -2949,14 +3206,17 @@ altivec_build_resolved_builtin (tree *ar tree altivec_resolve_overloaded_builtin (tree fndecl, tree arglist) { + unsigned int nargs = list_length (arglist); unsigned int fcode = DECL_FUNCTION_CODE (fndecl); tree fnargs = TYPE_ARG_TYPES (TREE_TYPE (fndecl)); tree types[3], args[3]; const struct altivec_builtin_types *desc; - int n; + unsigned int n; - if (fcode < ALTIVEC_BUILTIN_OVERLOADED_FIRST - || fcode > ALTIVEC_BUILTIN_OVERLOADED_LAST) + if ((fcode < ALTIVEC_BUILTIN_OVERLOADED_FIRST + || fcode > ALTIVEC_BUILTIN_OVERLOADED_LAST) + && (fcode < VSX_BUILTIN_OVERLOADED_FIRST + || fcode > VSX_BUILTIN_OVERLOADED_LAST)) return NULL_TREE; /* For now treat vec_splats and vec_promote as the same. */ @@ -2970,29 +3230,23 @@ altivec_resolve_overloaded_builtin (tree VEC(constructor_elt,gc) *vec; const char *name = fcode == ALTIVEC_BUILTIN_VEC_SPLATS ? 
"vec_splats": "vec_promote"; - if (!arglist) + if (nargs == 0) { error ("%s only accepts %d arguments", name, (fcode == ALTIVEC_BUILTIN_VEC_PROMOTE)+1 ); return error_mark_node; } - if (fcode == ALTIVEC_BUILTIN_VEC_SPLATS && TREE_CHAIN (arglist)) + if (fcode == ALTIVEC_BUILTIN_VEC_SPLATS && nargs != 1) { error ("%s only accepts 1 argument", name); return error_mark_node; } - if (fcode == ALTIVEC_BUILTIN_VEC_PROMOTE && !TREE_CHAIN (arglist)) + if (fcode == ALTIVEC_BUILTIN_VEC_PROMOTE && nargs != 2) { error ("%s only accepts 2 arguments", name); return error_mark_node; } /* Ignore promote's element argument. */ if (fcode == ALTIVEC_BUILTIN_VEC_PROMOTE - && TREE_CHAIN (TREE_CHAIN (arglist))) - { - error ("%s only accepts 2 arguments", name); - return error_mark_node; - } - if (fcode == ALTIVEC_BUILTIN_VEC_PROMOTE && !INTEGRAL_TYPE_P (TREE_TYPE (TREE_VALUE (TREE_CHAIN (arglist))))) goto bad; @@ -3002,11 +3256,12 @@ altivec_resolve_overloaded_builtin (tree && !INTEGRAL_TYPE_P (type)) goto bad; unsigned_p = TYPE_UNSIGNED (type); - if (type == long_long_unsigned_type_node - || type == long_long_integer_type_node) - goto bad; switch (TYPE_MODE (type)) { + case DImode: + type = (unsigned_p ? unsigned_V2DI_type_node : V2DI_type_node); + size = 2; + break; case SImode: type = (unsigned_p ? unsigned_V4SI_type_node : V4SI_type_node); size = 4; @@ -3020,6 +3275,7 @@ altivec_resolve_overloaded_builtin (tree size = 16; break; case SFmode: type = V4SF_type_node; size = 4; break; + case DFmode: type = V2DF_type_node; size = 2; break; default: goto bad; } @@ -3036,7 +3292,8 @@ altivec_resolve_overloaded_builtin (tree return build_constructor (type, vec); } - /* For now use pointer tricks to do the extaction. */ + /* For now use pointer tricks to do the extraction, unless we are on VSX + extracting a double from a constant offset. 
*/ if (fcode == ALTIVEC_BUILTIN_VEC_EXTRACT) { tree arg1; @@ -3045,10 +3302,10 @@ altivec_resolve_overloaded_builtin (tree tree arg1_inner_type; tree decl, stmt; tree innerptrtype; + enum machine_mode mode; /* No second argument. */ - if (!arglist || !TREE_CHAIN (arglist) - || TREE_CHAIN (TREE_CHAIN (arglist))) + if (nargs != 2) { error ("vec_extract only accepts 2 arguments"); return error_mark_node; @@ -3062,6 +3319,25 @@ altivec_resolve_overloaded_builtin (tree goto bad; if (!INTEGRAL_TYPE_P (TREE_TYPE (arg2))) goto bad; + + /* If we can use the VSX xxpermdi instruction, use that for extract. */ + mode = TYPE_MODE (arg1_type); + if ((mode == V2DFmode || mode == V2DImode) && VECTOR_UNIT_VSX_P (mode) + && TREE_CODE (arg2) == INTEGER_CST + && TREE_INT_CST_HIGH (arg2) == 0 + && (TREE_INT_CST_LOW (arg2) == 0 || TREE_INT_CST_LOW (arg2) == 1)) + { + tree call = NULL_TREE; + + if (mode == V2DFmode) + call = rs6000_builtin_decls[VSX_BUILTIN_VEC_EXT_V2DF]; + else if (mode == V2DImode) + call = rs6000_builtin_decls[VSX_BUILTIN_VEC_EXT_V2DI]; + + if (call) + return build_call_expr (call, 2, arg1, arg2); + } + /* Build *(((arg1_inner_type*)&(vector type){arg1})+arg2). */ arg1_inner_type = TREE_TYPE (arg1_type); arg2 = build_binary_op (BIT_AND_EXPR, arg2, @@ -3078,7 +3354,6 @@ altivec_resolve_overloaded_builtin (tree DECL_INITIAL (decl) = arg1; stmt = build1 (DECL_EXPR, arg1_type, decl); TREE_ADDRESSABLE (decl) = 1; - SET_EXPR_LOCATION (stmt, input_location); stmt = build1 (COMPOUND_LITERAL_EXPR, arg1_type, stmt); innerptrtype = build_pointer_type (arg1_inner_type); @@ -3091,7 +3366,8 @@ altivec_resolve_overloaded_builtin (tree return stmt; } - /* For now use pointer tricks to do the insertation. */ + /* For now use pointer tricks to do the insertion, unless we are on VSX + inserting a double to a constant offset. 
*/ if (fcode == ALTIVEC_BUILTIN_VEC_INSERT) { tree arg0; @@ -3101,11 +3377,10 @@ altivec_resolve_overloaded_builtin (tree tree arg1_inner_type; tree decl, stmt; tree innerptrtype; - + enum machine_mode mode; + /* No second or third arguments. */ - if (!arglist || !TREE_CHAIN (arglist) - || !TREE_CHAIN (TREE_CHAIN (arglist)) - || TREE_CHAIN (TREE_CHAIN (TREE_CHAIN (arglist)))) + if (nargs != 3) { error ("vec_insert only accepts 3 arguments"); return error_mark_node; @@ -3120,6 +3395,27 @@ altivec_resolve_overloaded_builtin (tree goto bad; if (!INTEGRAL_TYPE_P (TREE_TYPE (arg2))) goto bad; + + /* If we can use the VSX xxpermdi instruction, use that for insert. */ + mode = TYPE_MODE (arg1_type); + if ((mode == V2DFmode || mode == V2DImode) && VECTOR_UNIT_VSX_P (mode) + && TREE_CODE (arg2) == INTEGER_CST + && TREE_INT_CST_HIGH (arg2) == 0 + && (TREE_INT_CST_LOW (arg2) == 0 || TREE_INT_CST_LOW (arg2) == 1)) + { + tree call = NULL_TREE; + + if (mode == V2DFmode) + call = rs6000_builtin_decls[VSX_BUILTIN_VEC_SET_V2DF]; + else if (mode == V2DImode) + call = rs6000_builtin_decls[VSX_BUILTIN_VEC_SET_V2DI]; + + /* Note, __builtin_vec_insert_<xxx> has vector and scalar types + reversed. */ + if (call) + return build_call_expr (call, 3, arg1, arg0, arg2); + } + /* Build *(((arg1_inner_type*)&(vector type){arg1})+arg2) = arg0. 
*/ arg1_inner_type = TREE_TYPE (arg1_type); arg2 = build_binary_op (BIT_AND_EXPR, arg2, @@ -3136,7 +3432,6 @@ altivec_resolve_overloaded_builtin (tree DECL_INITIAL (decl) = arg1; stmt = build1 (DECL_EXPR, arg1_type, decl); TREE_ADDRESSABLE (decl) = 1; - SET_EXPR_LOCATION (stmt, input_location); stmt = build1 (COMPOUND_LITERAL_EXPR, arg1_type, stmt); innerptrtype = build_pointer_type (arg1_inner_type); @@ -3235,4 +3530,3 @@ altivec_resolve_overloaded_builtin (tree error ("invalid parameter combination for AltiVec intrinsic"); return error_mark_node; } - Index: gcc-4.3.4-20091019/gcc/config/rs6000/rs6000.md =================================================================== --- gcc-4.3.4-20091019.orig/gcc/config/rs6000/rs6000.md 2009-10-19 13:39:52.000000000 +0200 +++ gcc-4.3.4-20091019/gcc/config/rs6000/rs6000.md 2009-10-19 13:40:37.000000000 +0200 @@ -1,6 +1,6 @@ ;; Machine description for IBM RISC System 6000 (POWER) for GNU C compiler ;; Copyright (C) 1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997, 1998, -;; 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007 +;; 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009 ;; Free Software Foundation, Inc. 
;; Contributed by Richard Kenner (kenner@vlsi1.ultra.nyu.edu) @@ -99,6 +99,9 @@ (UNSPEC_DLMZB_CR 46) (UNSPEC_DLMZB_STRLEN 47) (UNSPEC_RSQRT 48) + (UNSPEC_TOCREL 49) + (UNSPEC_MACHOPIC_OFFSET 50) + (UNSPEC_BPERM 51) ]) ;; @@ -117,6 +120,9 @@ (define_attr "type" "integer,two,three,load,load_ext,load_ext_u,load_ext_ux,load_ux,load_u,store,store_ux,store_u,fpload,fpload_ux,fpload_u,fpstore,fpstore_ux,fpstore_u,vecload,vecstore,imul,imul2,imul3,lmul,idiv,ldiv,insert_word,branch,cmp,fast_compare,compare,var_delayed_compare,delayed_compare,imul_compare,lmul_compare,fpcompare,cr_logical,delayed_cr,mfcr,mfcrf,mtcr,mfjmpr,mtjmpr,fp,fpsimple,dmul,sdiv,ddiv,ssqrt,dsqrt,jmpreg,brinc,vecsimple,veccomplex,vecdiv,veccmp,veccmpsimple,vecperm,vecfloat,vecfdiv,isync,sync,load_l,store_c,shift,trap,insert_dword,var_shift_rotate,cntlz,exts,mffgpr,mftgpr" (const_string "integer")) +;; Define floating point instruction sub-types for use with Xfpu.md +(define_attr "fp_type" "fp_default,fp_addsub_s,fp_addsub_d,fp_mul_s,fp_mul_d,fp_div_s,fp_div_d,fp_maddsub_s,fp_maddsub_d,fp_sqrt_s,fp_sqrt_d" (const_string "fp_default")) + ;; Length (in bytes). ; '(pc)' in the following doesn't include the instruction itself; it is ; calculated as if the instruction had zero size. @@ -133,17 +139,16 @@ ;; Processor type -- this attribute must exactly match the processor_type ;; enumeration in rs6000.h. 
-(define_attr "cpu" "rios1,rios2,rs64a,mpccore,ppc403,ppc405,ppc440,ppc601,ppc603,ppc604,ppc604e,ppc620,ppc630,ppc750,ppc7400,ppc7450,ppc8540,power4,power5,power6,cell" +(define_attr "cpu" "rios1,rios2,rs64a,mpccore,ppc403,ppc405,ppc440,ppc601,ppc603,ppc604,ppc604e,ppc620,ppc630,ppc750,ppc7400,ppc7450,ppc8540,ppce300c2,ppce300c3,ppce500mc,power4,power5,power6,power7,cell" (const (symbol_ref "rs6000_cpu_attr"))) ;; If this instruction is microcoded on the CELL processor ; The default for load extended, the recorded instructions and rotate/shifts by a variable is always microcoded (define_attr "cell_micro" "not,conditional,always" - (if_then_else (eq_attr "type" "compare,delayed_compare,imul_compare,lmul_compare,load_ext,load_ext_ux,var_shift_rotate,var_delayed_compare") - (const_string "always") - (const_string "not"))) - + (if_then_else (eq_attr "type" "compare,delayed_compare,imul_compare,lmul_compare,load_ext,load_ext_ux,var_shift_rotate,var_delayed_compare") + (const_string "always") + (const_string "not"))) (automata_option "ndfa") @@ -158,10 +163,14 @@ (include "7xx.md") (include "7450.md") (include "8540.md") +(include "e300c2c3.md") +(include "e500mc.md") (include "power4.md") (include "power5.md") (include "power6.md") +(include "power7.md") (include "cell.md") +(include "xfpu.md") (include "predicates.md") (include "constraints.md") @@ -192,8 +201,11 @@ (define_mode_iterator P [(SI "TARGET_32BIT") (DI "TARGET_64BIT")]) ; Any hardware-supported floating-point mode -(define_mode_iterator FP [(SF "TARGET_HARD_FLOAT") - (DF "TARGET_HARD_FLOAT && (TARGET_FPRS || TARGET_E500_DOUBLE)") +(define_mode_iterator FP [ + (SF "TARGET_HARD_FLOAT + && ((TARGET_FPRS && TARGET_SINGLE_FLOAT) || TARGET_E500_SINGLE)") + (DF "TARGET_HARD_FLOAT + && ((TARGET_FPRS && TARGET_DOUBLE_FLOAT) || TARGET_E500_DOUBLE)") (TF "!TARGET_IEEEQUAD && TARGET_HARD_FLOAT && (TARGET_FPRS || TARGET_E500_DOUBLE) @@ -208,6 +220,19 @@ ; DImode bits (define_mode_attr dbits [(QI "56") (HI "48") (SI 
"32")]) +;; ISEL/ISEL64 target selection +(define_mode_attr sel [(SI "") (DI "64")]) + +;; Suffix for reload patterns +(define_mode_attr ptrsize [(SI "32bit") + (DI "64bit")]) + +(define_mode_attr tptrsize [(SI "TARGET_32BIT") + (DI "TARGET_64BIT")]) + +(define_mode_attr mptrsize [(SI "si") + (DI "di")]) + ;; Start with fixed-point load and store insns. Here we put only the more ;; complex forms. Basic data transfer is done later. @@ -510,7 +535,7 @@ "@ {andil.|andi.} %2,%1,0xff #" - [(set_attr "type" "compare") + [(set_attr "type" "fast_compare,compare") (set_attr "length" "4,8")]) (define_split @@ -536,7 +561,7 @@ "@ {andil.|andi.} %0,%1,0xff #" - [(set_attr "type" "compare") + [(set_attr "type" "fast_compare,compare") (set_attr "length" "4,8")]) (define_split @@ -677,7 +702,7 @@ "@ {andil.|andi.} %2,%1,0xff #" - [(set_attr "type" "compare") + [(set_attr "type" "fast_compare,compare") (set_attr "length" "4,8")]) (define_split @@ -703,7 +728,7 @@ "@ {andil.|andi.} %0,%1,0xff #" - [(set_attr "type" "compare") + [(set_attr "type" "fast_compare,compare") (set_attr "length" "4,8")]) (define_split @@ -846,7 +871,7 @@ "@ {andil.|andi.} %2,%1,0xffff #" - [(set_attr "type" "compare") + [(set_attr "type" "fast_compare,compare") (set_attr "length" "4,8")]) (define_split @@ -872,7 +897,7 @@ "@ {andil.|andi.} %0,%1,0xffff #" - [(set_attr "type" "compare") + [(set_attr "type" "fast_compare,compare") (set_attr "length" "4,8")]) (define_split @@ -949,7 +974,7 @@ [(set_attr "type" "compare") (set_attr "length" "4,8")]) -;; IBM 405 and 440 half-word multiplication operations. +;; IBM 405, 440 and 464 half-word multiplication operations. (define_insn "*macchwc" [(set (match_operand:CC 3 "cc_reg_operand" "=x") @@ -1413,7 +1438,7 @@ "mullhwu %0, %1, %2" [(set_attr "type" "imul3")]) -;; IBM 405 and 440 string-search dlmzb instruction support. +;; IBM 405, 440 and 464 string-search dlmzb instruction support. 
(define_insn "dlmzb" [(set (match_operand:CC 3 "cc_reg_operand" "=x") (unspec:CC [(match_operand:SI 1 "gpc_reg_operand" "r") @@ -1660,7 +1685,7 @@ "@ nor. %2,%1,%1 #" - [(set_attr "type" "compare") + [(set_attr "type" "fast_compare,compare") (set_attr "length" "4,8")]) (define_split @@ -1686,7 +1711,7 @@ "@ nor. %0,%1,%1 #" - [(set_attr "type" "compare") + [(set_attr "type" "fast_compare,compare") (set_attr "length" "4,8")]) (define_split @@ -2211,10 +2236,22 @@ "TARGET_POPCNTB" "popcntb %0,%1") +(define_insn "popcntwsi2" + [(set (match_operand:SI 0 "gpc_reg_operand" "=r") + (popcount:SI (match_operand:SI 1 "gpc_reg_operand" "r")))] + "TARGET_POPCNTD" + "popcntw %0,%1") + +(define_insn "popcntddi2" + [(set (match_operand:DI 0 "gpc_reg_operand" "=r") + (popcount:DI (match_operand:DI 1 "gpc_reg_operand" "r")))] + "TARGET_POPCNTD && TARGET_POWERPC64" + "popcntd %0,%1") + (define_expand "popcount<mode>2" [(set (match_operand:GPR 0 "gpc_reg_operand" "") (popcount:GPR (match_operand:GPR 1 "gpc_reg_operand" "")))] - "TARGET_POPCNTB" + "TARGET_POPCNTB || TARGET_POPCNTD" { rs6000_emit_popcount (operands[0], operands[1]); DONE; @@ -2229,15 +2266,102 @@ DONE; }) -(define_insn "bswapsi2" +;; Since the hardware zeros the upper part of the register, save generating the +;; AND immediate if we are converting to unsigned +(define_insn "*bswaphi2_extenddi" + [(set (match_operand:DI 0 "gpc_reg_operand" "=r") + (zero_extend:DI + (bswap:HI (match_operand:HI 1 "memory_operand" "Z"))))] + "TARGET_POWERPC64" + "lhbrx %0,%y1" + [(set_attr "length" "4") + (set_attr "type" "load")]) + +(define_insn "*bswaphi2_extendsi" + [(set (match_operand:SI 0 "gpc_reg_operand" "=r") + (zero_extend:SI + (bswap:HI (match_operand:HI 1 "memory_operand" "Z"))))] + "TARGET_POWERPC" + "lhbrx %0,%y1" + [(set_attr "length" "4") + (set_attr "type" "load")]) + +(define_expand "bswaphi2" + [(parallel [(set (match_operand:HI 0 "reg_or_mem_operand" "") + (bswap:HI + (match_operand:HI 1 "reg_or_mem_operand" ""))) + 
(clobber (match_scratch:SI 2 ""))])] + "" +{ + if (!REG_P (operands[0]) && !REG_P (operands[1])) + operands[1] = force_reg (HImode, operands[1]); +}) + +(define_insn "bswaphi2_internal" + [(set (match_operand:HI 0 "reg_or_mem_operand" "=r,Z,&r") + (bswap:HI + (match_operand:HI 1 "reg_or_mem_operand" "Z,r,r"))) + (clobber (match_scratch:SI 2 "=X,X,&r"))] + "TARGET_POWERPC" + "@ + lhbrx %0,%y1 + sthbrx %1,%y0 + #" + [(set_attr "length" "4,4,12") + (set_attr "type" "load,store,*")]) + +(define_split + [(set (match_operand:HI 0 "gpc_reg_operand" "") + (bswap:HI (match_operand:HI 1 "gpc_reg_operand" ""))) + (clobber (match_operand:SI 2 "gpc_reg_operand" ""))] + "TARGET_POWERPC && reload_completed" + [(set (match_dup 3) + (zero_extract:SI (match_dup 4) + (const_int 8) + (const_int 16))) + (set (match_dup 2) + (and:SI (ashift:SI (match_dup 4) + (const_int 8)) + (const_int 65280))) ;; 0xff00 + (set (match_dup 3) + (ior:SI (match_dup 3) + (match_dup 2)))] + " +{ + operands[3] = simplify_gen_subreg (SImode, operands[0], HImode, 0); + operands[4] = simplify_gen_subreg (SImode, operands[1], HImode, 0); +}") + +(define_insn "*bswapsi2_extenddi" + [(set (match_operand:DI 0 "gpc_reg_operand" "=r") + (zero_extend:DI + (bswap:SI (match_operand:SI 1 "memory_operand" "Z"))))] + "TARGET_POWERPC64" + "lwbrx %0,%y1" + [(set_attr "length" "4") + (set_attr "type" "load")]) + +(define_expand "bswapsi2" + [(set (match_operand:SI 0 "reg_or_mem_operand" "") + (bswap:SI + (match_operand:SI 1 "reg_or_mem_operand" "")))] + "" +{ + if (!REG_P (operands[0]) && !REG_P (operands[1])) + operands[1] = force_reg (SImode, operands[1]); +}) + +(define_insn "*bswapsi2_internal" [(set (match_operand:SI 0 "reg_or_mem_operand" "=r,Z,&r") - (bswap:SI (match_operand:SI 1 "reg_or_mem_operand" "Z,r,r")))] + (bswap:SI + (match_operand:SI 1 "reg_or_mem_operand" "Z,r,r")))] "" "@ {lbrx|lwbrx} %0,%y1 {stbrx|stwbrx} %1,%y0 #" - [(set_attr "length" "4,4,12")]) + [(set_attr "length" "4,4,12") + (set_attr "type" 
"load,store,*")]) (define_split [(set (match_operand:SI 0 "gpc_reg_operand" "") @@ -2256,6 +2380,300 @@ (const_int 16)))] "") +(define_expand "bswapdi2" + [(parallel [(set (match_operand:DI 0 "reg_or_mem_operand" "") + (bswap:DI + (match_operand:DI 1 "reg_or_mem_operand" ""))) + (clobber (match_scratch:DI 2 "")) + (clobber (match_scratch:DI 3 "")) + (clobber (match_scratch:DI 4 ""))])] + "" +{ + if (!REG_P (operands[0]) && !REG_P (operands[1])) + operands[1] = force_reg (DImode, operands[1]); + + if (!TARGET_POWERPC64) + { + /* 32-bit mode needs fewer scratch registers, but 32-bit addressing mode + that uses 64-bit registers needs the same scratch registers as 64-bit + mode. */ + emit_insn (gen_bswapdi2_32bit (operands[0], operands[1])); + DONE; + } +}) + +;; Power7/cell has ldbrx/stdbrx, so use it directly +(define_insn "*bswapdi2_ldbrx" + [(set (match_operand:DI 0 "reg_or_mem_operand" "=&r,Z,??&r") + (bswap:DI (match_operand:DI 1 "reg_or_mem_operand" "Z,r,r"))) + (clobber (match_scratch:DI 2 "=X,X,&r")) + (clobber (match_scratch:DI 3 "=X,X,&r")) + (clobber (match_scratch:DI 4 "=X,X,&r"))] + "TARGET_POWERPC64 && TARGET_LDBRX + && (REG_P (operands[0]) || REG_P (operands[1]))" + "@ + ldbrx %0,%y1 + stdbrx %1,%y0 + #" + [(set_attr "length" "4,4,36") + (set_attr "type" "load,store,*")]) + +;; Non-power7/cell, fall back to use lwbrx/stwbrx +(define_insn "*bswapdi2_64bit" + [(set (match_operand:DI 0 "reg_or_mem_operand" "=&r,Z,??&r") + (bswap:DI (match_operand:DI 1 "reg_or_mem_operand" "Z,r,r"))) + (clobber (match_scratch:DI 2 "=&b,&b,&r")) + (clobber (match_scratch:DI 3 "=&r,&r,&r")) + (clobber (match_scratch:DI 4 "=&r,X,&r"))] + "TARGET_POWERPC64 && !TARGET_LDBRX + && (REG_P (operands[0]) || REG_P (operands[1]))" + "#" + [(set_attr "length" "16,12,36")]) + +(define_split + [(set (match_operand:DI 0 "gpc_reg_operand" "") + (bswap:DI (match_operand:DI 1 "indexed_or_indirect_operand" ""))) + (clobber (match_operand:DI 2 "gpc_reg_operand" "")) + (clobber (match_operand:DI 
3 "gpc_reg_operand" "")) + (clobber (match_operand:DI 4 "gpc_reg_operand" ""))] + "TARGET_POWERPC64 && !TARGET_LDBRX && reload_completed" + [(const_int 0)] + " +{ + rtx dest = operands[0]; + rtx src = operands[1]; + rtx op2 = operands[2]; + rtx op3 = operands[3]; + rtx op4 = operands[4]; + rtx op3_32 = simplify_gen_subreg (SImode, op3, DImode, 4); + rtx op4_32 = simplify_gen_subreg (SImode, op4, DImode, 4); + rtx addr1; + rtx addr2; + rtx word_high; + rtx word_low; + + addr1 = XEXP (src, 0); + if (GET_CODE (addr1) == PLUS) + { + emit_insn (gen_add3_insn (op2, XEXP (addr1, 0), GEN_INT (4))); + addr2 = gen_rtx_PLUS (DImode, op2, XEXP (addr1, 1)); + } + else + { + emit_move_insn (op2, GEN_INT (4)); + addr2 = gen_rtx_PLUS (DImode, op2, addr1); + } + + if (BYTES_BIG_ENDIAN) + { + word_high = change_address (src, SImode, addr1); + word_low = change_address (src, SImode, addr2); + } + else + { + word_high = change_address (src, SImode, addr2); + word_low = change_address (src, SImode, addr1); + } + + emit_insn (gen_bswapsi2 (op3_32, word_low)); + emit_insn (gen_bswapsi2 (op4_32, word_high)); + emit_insn (gen_ashldi3 (dest, op3, GEN_INT (32))); + emit_insn (gen_iordi3 (dest, dest, op4)); +}") + +(define_split + [(set (match_operand:DI 0 "indexed_or_indirect_operand" "") + (bswap:DI (match_operand:DI 1 "gpc_reg_operand" ""))) + (clobber (match_operand:DI 2 "gpc_reg_operand" "")) + (clobber (match_operand:DI 3 "gpc_reg_operand" "")) + (clobber (match_operand:DI 4 "" ""))] + "TARGET_POWERPC64 && !TARGET_LDBRX && reload_completed" + [(const_int 0)] + " +{ + rtx dest = operands[0]; + rtx src = operands[1]; + rtx op2 = operands[2]; + rtx op3 = operands[3]; + rtx src_si = simplify_gen_subreg (SImode, src, DImode, 4); + rtx op3_si = simplify_gen_subreg (SImode, op3, DImode, 4); + rtx addr1; + rtx addr2; + rtx word_high; + rtx word_low; + + addr1 = XEXP (dest, 0); + if (GET_CODE (addr1) == PLUS) + { + emit_insn (gen_add3_insn (op2, XEXP (addr1, 0), GEN_INT (4))); + addr2 = 
gen_rtx_PLUS (DImode, op2, XEXP (addr1, 1)); + } + else + { + emit_move_insn (op2, GEN_INT (4)); + addr2 = gen_rtx_PLUS (DImode, op2, addr1); + } + + emit_insn (gen_lshrdi3 (op3, src, GEN_INT (32))); + if (BYTES_BIG_ENDIAN) + { + word_high = change_address (dest, SImode, addr1); + word_low = change_address (dest, SImode, addr2); + emit_insn (gen_bswapsi2 (word_high, src_si)); + emit_insn (gen_bswapsi2 (word_low, op3_si)); + } + else + { + word_high = change_address (dest, SImode, addr2); + word_low = change_address (dest, SImode, addr1); + emit_insn (gen_bswapsi2 (word_low, src_si)); + emit_insn (gen_bswapsi2 (word_high, op3_si)); + } +}") + +(define_split + [(set (match_operand:DI 0 "gpc_reg_operand" "") + (bswap:DI (match_operand:DI 1 "gpc_reg_operand" ""))) + (clobber (match_operand:DI 2 "gpc_reg_operand" "")) + (clobber (match_operand:DI 3 "gpc_reg_operand" "")) + (clobber (match_operand:DI 4 "" ""))] + "TARGET_POWERPC64 && reload_completed" + [(const_int 0)] + " +{ + rtx dest = operands[0]; + rtx src = operands[1]; + rtx op2 = operands[2]; + rtx op3 = operands[3]; + rtx dest_si = simplify_gen_subreg (SImode, dest, DImode, 4); + rtx src_si = simplify_gen_subreg (SImode, src, DImode, 4); + rtx op2_si = simplify_gen_subreg (SImode, op2, DImode, 4); + rtx op3_si = simplify_gen_subreg (SImode, op3, DImode, 4); + + emit_insn (gen_lshrdi3 (op2, src, GEN_INT (32))); + emit_insn (gen_bswapsi2 (dest_si, src_si)); + emit_insn (gen_bswapsi2 (op3_si, op2_si)); + emit_insn (gen_ashldi3 (dest, dest, GEN_INT (32))); + emit_insn (gen_iordi3 (dest, dest, op3)); +}") + +(define_insn "bswapdi2_32bit" + [(set (match_operand:DI 0 "reg_or_mem_operand" "=&r,Z,??&r") + (bswap:DI (match_operand:DI 1 "reg_or_mem_operand" "Z,r,r"))) + (clobber (match_scratch:SI 2 "=&b,&b,X"))] + "!TARGET_POWERPC64 && (REG_P (operands[0]) || REG_P (operands[1]))" + "#" + [(set_attr "length" "16,12,36")]) + +(define_split + [(set (match_operand:DI 0 "gpc_reg_operand" "") + (bswap:DI (match_operand:DI 1 
"indexed_or_indirect_operand" ""))) + (clobber (match_operand:SI 2 "gpc_reg_operand" ""))] + "!TARGET_POWERPC64 && reload_completed" + [(const_int 0)] + " +{ + rtx dest = operands[0]; + rtx src = operands[1]; + rtx op2 = operands[2]; + rtx dest_hi = simplify_gen_subreg (SImode, dest, DImode, 0); + rtx dest_lo = simplify_gen_subreg (SImode, dest, DImode, 4); + rtx addr1; + rtx addr2; + rtx word_high; + rtx word_low; + + addr1 = XEXP (src, 0); + if (GET_CODE (addr1) == PLUS) + { + emit_insn (gen_add3_insn (op2, XEXP (addr1, 0), GEN_INT (4))); + addr2 = gen_rtx_PLUS (SImode, op2, XEXP (addr1, 1)); + } + else + { + emit_move_insn (op2, GEN_INT (4)); + addr2 = gen_rtx_PLUS (SImode, op2, addr1); + } + + if (BYTES_BIG_ENDIAN) + { + word_high = change_address (src, SImode, addr1); + word_low = change_address (src, SImode, addr2); + } + else + { + word_high = change_address (src, SImode, addr2); + word_low = change_address (src, SImode, addr1); + } + + emit_insn (gen_bswapsi2 (dest_hi, word_low)); + emit_insn (gen_bswapsi2 (dest_lo, word_high)); +}") + +(define_split + [(set (match_operand:DI 0 "indexed_or_indirect_operand" "") + (bswap:DI (match_operand:DI 1 "gpc_reg_operand" ""))) + (clobber (match_operand:SI 2 "gpc_reg_operand" ""))] + "!TARGET_POWERPC64 && reload_completed" + [(const_int 0)] + " +{ + rtx dest = operands[0]; + rtx src = operands[1]; + rtx op2 = operands[2]; + rtx src_high = simplify_gen_subreg (SImode, src, DImode, 0); + rtx src_low = simplify_gen_subreg (SImode, src, DImode, 4); + rtx addr1; + rtx addr2; + rtx word_high; + rtx word_low; + + addr1 = XEXP (dest, 0); + if (GET_CODE (addr1) == PLUS) + { + emit_insn (gen_add3_insn (op2, XEXP (addr1, 0), GEN_INT (4))); + addr2 = gen_rtx_PLUS (SImode, op2, XEXP (addr1, 1)); + } + else + { + emit_move_insn (op2, GEN_INT (4)); + addr2 = gen_rtx_PLUS (SImode, op2, addr1); + } + + if (BYTES_BIG_ENDIAN) + { + word_high = change_address (dest, SImode, addr1); + word_low = change_address (dest, SImode, addr2); + } + 
else + { + word_high = change_address (dest, SImode, addr2); + word_low = change_address (dest, SImode, addr1); + } + + emit_insn (gen_bswapsi2 (word_high, src_low)); + emit_insn (gen_bswapsi2 (word_low, src_high)); +}") + +(define_split + [(set (match_operand:DI 0 "gpc_reg_operand" "") + (bswap:DI (match_operand:DI 1 "gpc_reg_operand" ""))) + (clobber (match_operand:SI 2 "" ""))] + "!TARGET_POWERPC64 && reload_completed" + [(const_int 0)] + " +{ + rtx dest = operands[0]; + rtx src = operands[1]; + rtx src_high = simplify_gen_subreg (SImode, src, DImode, 0); + rtx src_low = simplify_gen_subreg (SImode, src, DImode, 4); + rtx dest_high = simplify_gen_subreg (SImode, dest, DImode, 0); + rtx dest_low = simplify_gen_subreg (SImode, dest, DImode, 4); + + emit_insn (gen_bswapsi2 (dest_high, src_low)); + emit_insn (gen_bswapsi2 (dest_low, src_high)); +}") + (define_expand "mulsi3" [(use (match_operand:SI 0 "gpc_reg_operand" "")) (use (match_operand:SI 1 "gpc_reg_operand" "")) @@ -2842,7 +3260,7 @@ {rlinm|rlwinm} %0,%1,0,%m2,%M2 {andil.|andi.} %0,%1,%b2 {andiu.|andis.} %0,%1,%u2" - [(set_attr "type" "*,*,compare,compare")]) + [(set_attr "type" "*,*,fast_compare,fast_compare")]) (define_insn "andsi3_nomc" [(set (match_operand:SI 0 "gpc_reg_operand" "=r,r") @@ -2885,7 +3303,8 @@ # # #" - [(set_attr "type" "compare,compare,compare,delayed_compare,compare,compare,compare,compare") + [(set_attr "type" "fast_compare,fast_compare,fast_compare,delayed_compare,\ + compare,compare,compare,compare") (set_attr "length" "4,4,4,4,8,8,8,8")]) (define_insn "*andsi3_internal3_mc" @@ -2905,7 +3324,8 @@ # # #" - [(set_attr "type" "compare,compare,compare,delayed_compare,compare,compare,compare,compare") + [(set_attr "type" "compare,fast_compare,fast_compare,delayed_compare,compare,\ + compare,compare,compare") (set_attr "length" "8,4,4,4,8,8,8,8")]) (define_split @@ -2964,7 +3384,8 @@ # # #" - [(set_attr "type" "compare,compare,compare,delayed_compare,compare,compare,compare,compare") + 
[(set_attr "type" "fast_compare,fast_compare,fast_compare,delayed_compare,\ + compare,compare,compare,compare") (set_attr "length" "4,4,4,4,8,8,8,8")]) (define_insn "*andsi3_internal5_mc" @@ -2986,23 +3407,10 @@ # # #" - [(set_attr "type" "compare,compare,compare,delayed_compare,compare,compare,compare,compare") + [(set_attr "type" "compare,fast_compare,fast_compare,delayed_compare,compare,\ + compare,compare,compare") (set_attr "length" "8,4,4,4,8,8,8,8")]) -(define_insn "*andsi3_internal5_nomc" - [(set (match_operand:CC 3 "cc_reg_operand" "=x,?y,??y,??y,?y") - (compare:CC (and:SI (match_operand:SI 1 "gpc_reg_operand" "%r,r,r,r,r") - (match_operand:SI 2 "and_operand" "r,r,K,L,T")) - (const_int 0))) - (set (match_operand:SI 0 "gpc_reg_operand" "=r,r,r,r,r") - (and:SI (match_dup 1) - (match_dup 2))) - (clobber (match_scratch:CC 4 "=X,X,x,x,X"))] - "TARGET_64BIT && !rs6000_gen_cell_microcode" - "#" - [(set_attr "type" "compare") - (set_attr "length" "8,8,8,8,8")]) - (define_split [(set (match_operand:CC 3 "cc_reg_not_micro_cr0_operand" "") (compare:CC (and:SI (match_operand:SI 1 "gpc_reg_operand" "") @@ -3131,7 +3539,7 @@ "@ %q4. %3,%1,%2 #" - [(set_attr "type" "compare") + [(set_attr "type" "fast_compare,compare") (set_attr "length" "4,8")]) (define_split @@ -3160,7 +3568,7 @@ "@ %q4. %0,%1,%2 #" - [(set_attr "type" "compare") + [(set_attr "type" "fast_compare,compare") (set_attr "length" "4,8")]) (define_split @@ -3285,7 +3693,7 @@ "@ %q4. %3,%1,%2 #" - [(set_attr "type" "compare") + [(set_attr "type" "fast_compare,compare") (set_attr "length" "4,8")]) (define_split @@ -3314,7 +3722,7 @@ "@ %q4. 
%0,%1,%2 #" - [(set_attr "type" "compare") + [(set_attr "type" "fast_compare,compare") (set_attr "length" "4,8")]) (define_split @@ -3965,6 +4373,17 @@ {rlinm|rlwinm} %0,%1,%h2,0xffffffff" [(set_attr "type" "var_shift_rotate,integer")]) +(define_insn "*rotlsi3_64" + [(set (match_operand:DI 0 "gpc_reg_operand" "=r,r") + (zero_extend:DI + (rotate:SI (match_operand:SI 1 "gpc_reg_operand" "r,r") + (match_operand:SI 2 "reg_or_cint_operand" "r,i"))))] + "TARGET_64BIT" + "@ + {rlnm|rlwnm} %0,%1,%2,0xffffffff + {rlinm|rlwinm} %0,%1,%h2,0xffffffff" + [(set_attr "type" "var_shift_rotate,integer")]) + (define_insn "*rotlsi3_internal2" [(set (match_operand:CC 0 "cc_reg_operand" "=x,x,?y,?y") (compare:CC (rotate:SI (match_operand:SI 1 "gpc_reg_operand" "r,r,r,r") @@ -4309,6 +4728,17 @@ {sli|slwi} %0,%1,%h2" [(set_attr "type" "var_shift_rotate,shift")]) +(define_insn "*ashlsi3_64" + [(set (match_operand:DI 0 "gpc_reg_operand" "=r,r") + (zero_extend:DI + (ashift:SI (match_operand:SI 1 "gpc_reg_operand" "r,r") + (match_operand:SI 2 "reg_or_cint_operand" "r,i"))))] + "TARGET_POWERPC64" + "@ + {sl|slw} %0,%1,%2 + {sli|slwi} %0,%1,%h2" + [(set_attr "type" "var_shift_rotate,shift")]) + (define_insn "" [(set (match_operand:CC 0 "cc_reg_operand" "=x,x,?y,?y") (compare:CC (ashift:SI (match_operand:SI 1 "gpc_reg_operand" "r,r,r,r") @@ -4546,6 +4976,17 @@ {sri|srwi} %0,%1,%h2" [(set_attr "type" "integer,var_shift_rotate,shift")]) +(define_insn "*lshrsi3_64" + [(set (match_operand:DI 0 "gpc_reg_operand" "=r,r") + (zero_extend:DI + (lshiftrt:SI (match_operand:SI 1 "gpc_reg_operand" "r,r") + (match_operand:SI 2 "reg_or_cint_operand" "r,i"))))] + "TARGET_POWERPC64" + "@ + {sr|srw} %0,%1,%2 + {sri|srwi} %0,%1,%h2" + [(set_attr "type" "var_shift_rotate,shift")]) + (define_insn "" [(set (match_operand:CC 0 "cc_reg_operand" "=x,x,x,?y,?y,?y") (compare:CC (lshiftrt:SI (match_operand:SI 1 "gpc_reg_operand" "r,r,r,r,r,r") @@ -4974,6 +5415,17 @@ {srai|srawi} %0,%1,%h2" [(set_attr "type" 
"var_shift_rotate,shift")]) +(define_insn "*ashrsi3_64" + [(set (match_operand:DI 0 "gpc_reg_operand" "=r,r") + (sign_extend:DI + (ashiftrt:SI (match_operand:SI 1 "gpc_reg_operand" "r,r") + (match_operand:SI 2 "reg_or_cint_operand" "r,i"))))] + "TARGET_POWERPC64" + "@ + {sra|sraw} %0,%1,%2 + {srai|srawi} %0,%1,%h2" + [(set_attr "type" "var_shift_rotate,shift")]) + (define_insn "" [(set (match_operand:CC 0 "cc_reg_operand" "=x,x,?y,?y") (compare:CC (ashiftrt:SI (match_operand:SI 1 "gpc_reg_operand" "r,r,r,r") @@ -5119,13 +5571,13 @@ (define_expand "extendsfdf2" [(set (match_operand:DF 0 "gpc_reg_operand" "") (float_extend:DF (match_operand:SF 1 "reg_or_none500mem_operand" "")))] - "TARGET_HARD_FLOAT && (TARGET_FPRS || TARGET_E500_DOUBLE)" + "TARGET_HARD_FLOAT && ((TARGET_FPRS && TARGET_DOUBLE_FLOAT) || TARGET_E500_DOUBLE)" "") (define_insn_and_split "*extendsfdf2_fpr" - [(set (match_operand:DF 0 "gpc_reg_operand" "=f,?f,f") + [(set (match_operand:DF 0 "gpc_reg_operand" "=d,?d,d") (float_extend:DF (match_operand:SF 1 "reg_or_mem_operand" "0,f,m")))] - "TARGET_HARD_FLOAT && TARGET_FPRS" + "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT" "@ # fmr %0,%1 @@ -5141,13 +5593,13 @@ (define_expand "truncdfsf2" [(set (match_operand:SF 0 "gpc_reg_operand" "") (float_truncate:SF (match_operand:DF 1 "gpc_reg_operand" "")))] - "TARGET_HARD_FLOAT && (TARGET_FPRS || TARGET_E500_DOUBLE)" + "TARGET_HARD_FLOAT && ((TARGET_FPRS && TARGET_DOUBLE_FLOAT) || TARGET_E500_DOUBLE)" "") (define_insn "*truncdfsf2_fpr" [(set (match_operand:SF 0 "gpc_reg_operand" "=f") - (float_truncate:SF (match_operand:DF 1 "gpc_reg_operand" "f")))] - "TARGET_HARD_FLOAT && TARGET_FPRS" + (float_truncate:SF (match_operand:DF 1 "gpc_reg_operand" "d")))] + "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT" "frsp %0,%1" [(set_attr "type" "fp")]) @@ -5161,33 +5613,33 @@ (define_expand "negsf2" [(set (match_operand:SF 0 "gpc_reg_operand" "") (neg:SF (match_operand:SF 1 "gpc_reg_operand" "")))] - 
"TARGET_HARD_FLOAT" + "TARGET_HARD_FLOAT && TARGET_SINGLE_FLOAT" "") (define_insn "*negsf2" [(set (match_operand:SF 0 "gpc_reg_operand" "=f") (neg:SF (match_operand:SF 1 "gpc_reg_operand" "f")))] - "TARGET_HARD_FLOAT && TARGET_FPRS" + "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT" "fneg %0,%1" [(set_attr "type" "fp")]) (define_expand "abssf2" [(set (match_operand:SF 0 "gpc_reg_operand" "") (abs:SF (match_operand:SF 1 "gpc_reg_operand" "")))] - "TARGET_HARD_FLOAT" + "TARGET_HARD_FLOAT && TARGET_SINGLE_FLOAT" "") (define_insn "*abssf2" [(set (match_operand:SF 0 "gpc_reg_operand" "=f") (abs:SF (match_operand:SF 1 "gpc_reg_operand" "f")))] - "TARGET_HARD_FLOAT && TARGET_FPRS" + "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT" "fabs %0,%1" [(set_attr "type" "fp")]) (define_insn "" [(set (match_operand:SF 0 "gpc_reg_operand" "=f") (neg:SF (abs:SF (match_operand:SF 1 "gpc_reg_operand" "f"))))] - "TARGET_HARD_FLOAT && TARGET_FPRS" + "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT" "fnabs %0,%1" [(set_attr "type" "fp")]) @@ -5195,16 +5647,17 @@ [(set (match_operand:SF 0 "gpc_reg_operand" "") (plus:SF (match_operand:SF 1 "gpc_reg_operand" "") (match_operand:SF 2 "gpc_reg_operand" "")))] - "TARGET_HARD_FLOAT" + "TARGET_HARD_FLOAT && TARGET_SINGLE_FLOAT" "") (define_insn "" [(set (match_operand:SF 0 "gpc_reg_operand" "=f") (plus:SF (match_operand:SF 1 "gpc_reg_operand" "%f") (match_operand:SF 2 "gpc_reg_operand" "f")))] - "TARGET_POWERPC && TARGET_HARD_FLOAT && TARGET_FPRS" + "TARGET_POWERPC && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT" "fadds %0,%1,%2" - [(set_attr "type" "fp")]) + [(set_attr "type" "fp") + (set_attr "fp_type" "fp_addsub_s")]) (define_insn "" [(set (match_operand:SF 0 "gpc_reg_operand" "=f") @@ -5218,16 +5671,17 @@ [(set (match_operand:SF 0 "gpc_reg_operand" "") (minus:SF (match_operand:SF 1 "gpc_reg_operand" "") (match_operand:SF 2 "gpc_reg_operand" "")))] - "TARGET_HARD_FLOAT" + "TARGET_HARD_FLOAT && 
TARGET_SINGLE_FLOAT" "") (define_insn "" [(set (match_operand:SF 0 "gpc_reg_operand" "=f") (minus:SF (match_operand:SF 1 "gpc_reg_operand" "f") (match_operand:SF 2 "gpc_reg_operand" "f")))] - "TARGET_POWERPC && TARGET_HARD_FLOAT && TARGET_FPRS" + "TARGET_POWERPC && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT" "fsubs %0,%1,%2" - [(set_attr "type" "fp")]) + [(set_attr "type" "fp") + (set_attr "fp_type" "fp_addsub_s")]) (define_insn "" [(set (match_operand:SF 0 "gpc_reg_operand" "=f") @@ -5241,16 +5695,17 @@ [(set (match_operand:SF 0 "gpc_reg_operand" "") (mult:SF (match_operand:SF 1 "gpc_reg_operand" "") (match_operand:SF 2 "gpc_reg_operand" "")))] - "TARGET_HARD_FLOAT" + "TARGET_HARD_FLOAT && TARGET_SINGLE_FLOAT" "") (define_insn "" [(set (match_operand:SF 0 "gpc_reg_operand" "=f") (mult:SF (match_operand:SF 1 "gpc_reg_operand" "%f") (match_operand:SF 2 "gpc_reg_operand" "f")))] - "TARGET_POWERPC && TARGET_HARD_FLOAT && TARGET_FPRS" + "TARGET_POWERPC && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT" "fmuls %0,%1,%2" - [(set_attr "type" "fp")]) + [(set_attr "type" "fp") + (set_attr "fp_type" "fp_mul_s")]) (define_insn "" [(set (match_operand:SF 0 "gpc_reg_operand" "=f") @@ -5264,14 +5719,15 @@ [(set (match_operand:SF 0 "gpc_reg_operand" "") (div:SF (match_operand:SF 1 "gpc_reg_operand" "") (match_operand:SF 2 "gpc_reg_operand" "")))] - "TARGET_HARD_FLOAT" + "TARGET_HARD_FLOAT && TARGET_SINGLE_FLOAT && !TARGET_SIMPLE_FPU" "") (define_insn "" [(set (match_operand:SF 0 "gpc_reg_operand" "=f") (div:SF (match_operand:SF 1 "gpc_reg_operand" "f") (match_operand:SF 2 "gpc_reg_operand" "f")))] - "TARGET_POWERPC && TARGET_HARD_FLOAT && TARGET_FPRS" + "TARGET_POWERPC && TARGET_HARD_FLOAT && TARGET_FPRS + && TARGET_SINGLE_FLOAT && !TARGET_SIMPLE_FPU" "fdivs %0,%1,%2" [(set_attr "type" "sdiv")]) @@ -5279,7 +5735,8 @@ [(set (match_operand:SF 0 "gpc_reg_operand" "=f") (div:SF (match_operand:SF 1 "gpc_reg_operand" "f") (match_operand:SF 2 "gpc_reg_operand" 
"f")))] - "! TARGET_POWERPC && TARGET_HARD_FLOAT && TARGET_FPRS" + "! TARGET_POWERPC && TARGET_HARD_FLOAT && TARGET_FPRS + && TARGET_SINGLE_FLOAT && !TARGET_SIMPLE_FPU" "{fd|fdiv} %0,%1,%2" [(set_attr "type" "ddiv")]) @@ -5302,16 +5759,18 @@ "fres %0,%1" [(set_attr "type" "fp")]) -(define_insn "" +(define_insn "*fmaddsf4_powerpc" [(set (match_operand:SF 0 "gpc_reg_operand" "=f") (plus:SF (mult:SF (match_operand:SF 1 "gpc_reg_operand" "%f") (match_operand:SF 2 "gpc_reg_operand" "f")) (match_operand:SF 3 "gpc_reg_operand" "f")))] - "TARGET_POWERPC && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD" + "TARGET_POWERPC && TARGET_HARD_FLOAT && TARGET_FPRS + && TARGET_SINGLE_FLOAT && TARGET_FUSED_MADD" "fmadds %0,%1,%2,%3" - [(set_attr "type" "fp")]) + [(set_attr "type" "fp") + (set_attr "fp_type" "fp_maddsub_s")]) -(define_insn "" +(define_insn "*fmaddsf4_power" [(set (match_operand:SF 0 "gpc_reg_operand" "=f") (plus:SF (mult:SF (match_operand:SF 1 "gpc_reg_operand" "%f") (match_operand:SF 2 "gpc_reg_operand" "f")) @@ -5320,16 +5779,18 @@ "{fma|fmadd} %0,%1,%2,%3" [(set_attr "type" "dmul")]) -(define_insn "" +(define_insn "*fmsubsf4_powerpc" [(set (match_operand:SF 0 "gpc_reg_operand" "=f") (minus:SF (mult:SF (match_operand:SF 1 "gpc_reg_operand" "%f") (match_operand:SF 2 "gpc_reg_operand" "f")) (match_operand:SF 3 "gpc_reg_operand" "f")))] - "TARGET_POWERPC && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD" + "TARGET_POWERPC && TARGET_HARD_FLOAT && TARGET_FPRS + && TARGET_SINGLE_FLOAT && TARGET_FUSED_MADD" "fmsubs %0,%1,%2,%3" - [(set_attr "type" "fp")]) + [(set_attr "type" "fp") + (set_attr "fp_type" "fp_maddsub_s")]) -(define_insn "" +(define_insn "*fmsubsf4_power" [(set (match_operand:SF 0 "gpc_reg_operand" "=f") (minus:SF (mult:SF (match_operand:SF 1 "gpc_reg_operand" "%f") (match_operand:SF 2 "gpc_reg_operand" "f")) @@ -5338,27 +5799,29 @@ "{fms|fmsub} %0,%1,%2,%3" [(set_attr "type" "dmul")]) -(define_insn "" +(define_insn "*fnmaddsf4_powerpc_1" [(set 
(match_operand:SF 0 "gpc_reg_operand" "=f") (neg:SF (plus:SF (mult:SF (match_operand:SF 1 "gpc_reg_operand" "%f") (match_operand:SF 2 "gpc_reg_operand" "f")) (match_operand:SF 3 "gpc_reg_operand" "f"))))] "TARGET_POWERPC && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD - && HONOR_SIGNED_ZEROS (SFmode)" + && TARGET_SINGLE_FLOAT && HONOR_SIGNED_ZEROS (SFmode)" "fnmadds %0,%1,%2,%3" - [(set_attr "type" "fp")]) + [(set_attr "type" "fp") + (set_attr "fp_type" "fp_maddsub_s")]) -(define_insn "" +(define_insn "*fnmaddsf4_powerpc_2" [(set (match_operand:SF 0 "gpc_reg_operand" "=f") (minus:SF (mult:SF (neg:SF (match_operand:SF 1 "gpc_reg_operand" "f")) (match_operand:SF 2 "gpc_reg_operand" "f")) (match_operand:SF 3 "gpc_reg_operand" "f")))] - "TARGET_POWERPC && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD + "TARGET_POWERPC && TARGET_SINGLE_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD && ! HONOR_SIGNED_ZEROS (SFmode)" "fnmadds %0,%1,%2,%3" - [(set_attr "type" "fp")]) + [(set_attr "type" "fp") + (set_attr "fp_type" "fp_maddsub_s")]) -(define_insn "" +(define_insn "*fnmaddsf4_power_1" [(set (match_operand:SF 0 "gpc_reg_operand" "=f") (neg:SF (plus:SF (mult:SF (match_operand:SF 1 "gpc_reg_operand" "%f") (match_operand:SF 2 "gpc_reg_operand" "f")) @@ -5367,7 +5830,7 @@ "{fnma|fnmadd} %0,%1,%2,%3" [(set_attr "type" "dmul")]) -(define_insn "" +(define_insn "*fnmaddsf4_power_2" [(set (match_operand:SF 0 "gpc_reg_operand" "=f") (minus:SF (mult:SF (neg:SF (match_operand:SF 1 "gpc_reg_operand" "f")) (match_operand:SF 2 "gpc_reg_operand" "f")) @@ -5377,27 +5840,29 @@ "{fnma|fnmadd} %0,%1,%2,%3" [(set_attr "type" "dmul")]) -(define_insn "" +(define_insn "*fnmsubsf4_powerpc_1" [(set (match_operand:SF 0 "gpc_reg_operand" "=f") (neg:SF (minus:SF (mult:SF (match_operand:SF 1 "gpc_reg_operand" "%f") (match_operand:SF 2 "gpc_reg_operand" "f")) (match_operand:SF 3 "gpc_reg_operand" "f"))))] "TARGET_POWERPC && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD - && 
HONOR_SIGNED_ZEROS (SFmode)" + && TARGET_SINGLE_FLOAT && HONOR_SIGNED_ZEROS (SFmode)" "fnmsubs %0,%1,%2,%3" - [(set_attr "type" "fp")]) + [(set_attr "type" "fp") + (set_attr "fp_type" "fp_maddsub_s")]) -(define_insn "" +(define_insn "*fnmsubsf4_powerpc_2" [(set (match_operand:SF 0 "gpc_reg_operand" "=f") (minus:SF (match_operand:SF 3 "gpc_reg_operand" "f") (mult:SF (match_operand:SF 1 "gpc_reg_operand" "%f") (match_operand:SF 2 "gpc_reg_operand" "f"))))] "TARGET_POWERPC && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD - && ! HONOR_SIGNED_ZEROS (SFmode)" + && TARGET_SINGLE_FLOAT && ! HONOR_SIGNED_ZEROS (SFmode)" "fnmsubs %0,%1,%2,%3" - [(set_attr "type" "fp")]) + [(set_attr "type" "fp") + (set_attr "fp_type" "fp_maddsub_s")]) -(define_insn "" +(define_insn "*fnmsubsf4_power_1" [(set (match_operand:SF 0 "gpc_reg_operand" "=f") (neg:SF (minus:SF (mult:SF (match_operand:SF 1 "gpc_reg_operand" "%f") (match_operand:SF 2 "gpc_reg_operand" "f")) @@ -5406,7 +5871,7 @@ "{fnms|fnmsub} %0,%1,%2,%3" [(set_attr "type" "dmul")]) -(define_insn "" +(define_insn "*fnmsubsf4_power_2" [(set (match_operand:SF 0 "gpc_reg_operand" "=f") (minus:SF (match_operand:SF 3 "gpc_reg_operand" "f") (mult:SF (match_operand:SF 1 "gpc_reg_operand" "%f") @@ -5419,20 +5884,24 @@ (define_expand "sqrtsf2" [(set (match_operand:SF 0 "gpc_reg_operand" "") (sqrt:SF (match_operand:SF 1 "gpc_reg_operand" "")))] - "(TARGET_PPC_GPOPT || TARGET_POWER2) && TARGET_HARD_FLOAT && TARGET_FPRS" + "(TARGET_PPC_GPOPT || TARGET_POWER2 || TARGET_XILINX_FPU) + && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT + && !TARGET_SIMPLE_FPU" "") (define_insn "" [(set (match_operand:SF 0 "gpc_reg_operand" "=f") (sqrt:SF (match_operand:SF 1 "gpc_reg_operand" "f")))] - "TARGET_PPC_GPOPT && TARGET_HARD_FLOAT && TARGET_FPRS" + "(TARGET_PPC_GPOPT || TARGET_XILINX_FPU) && TARGET_HARD_FLOAT + && TARGET_FPRS && TARGET_SINGLE_FLOAT && !TARGET_SIMPLE_FPU" "fsqrts %0,%1" [(set_attr "type" "ssqrt")]) (define_insn "" [(set 
(match_operand:SF 0 "gpc_reg_operand" "=f") (sqrt:SF (match_operand:SF 1 "gpc_reg_operand" "f")))] - "TARGET_POWER2 && TARGET_HARD_FLOAT && TARGET_FPRS" + "TARGET_POWER2 && TARGET_HARD_FLOAT && TARGET_FPRS + && TARGET_SINGLE_FLOAT && !TARGET_SIMPLE_FPU" "fsqrt %0,%1" [(set_attr "type" "dsqrt")]) @@ -5465,7 +5934,7 @@ (match_dup 5)) (match_dup 3) (match_dup 4)))] - "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS + "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT && !HONOR_NANS (SFmode) && !HONOR_SIGNED_ZEROS (SFmode)" { operands[3] = gen_reg_rtx (SFmode); @@ -5483,9 +5952,18 @@ (match_dup 5)) (match_dup 3) (match_dup 4)))] - "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS - && !HONOR_NANS (DFmode) && !HONOR_SIGNED_ZEROS (DFmode)" + "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT + && ((TARGET_PPC_GFXOPT + && !HONOR_NANS (DFmode) + && !HONOR_SIGNED_ZEROS (DFmode)) + || VECTOR_UNIT_VSX_P (DFmode))" { + if (VECTOR_UNIT_VSX_P (DFmode)) + { + emit_insn (gen_vsx_copysigndf3 (operands[0], operands[1], + operands[2], CONST0_RTX (DFmode))); + DONE; + } operands[3] = gen_reg_rtx (DFmode); operands[4] = gen_reg_rtx (DFmode); operands[5] = CONST0_RTX (DFmode); @@ -5501,7 +5979,8 @@ (match_operand:SF 2 "gpc_reg_operand" "")) (match_dup 1) (match_dup 2)))] - "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS && !flag_trapping_math" + "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS + && TARGET_SINGLE_FLOAT && !flag_trapping_math" "{ rs6000_emit_minmax (operands[0], SMAX, operands[1], operands[2]); DONE;}") (define_expand "sminsf3" @@ -5510,7 +5989,8 @@ (match_operand:SF 2 "gpc_reg_operand" "")) (match_dup 2) (match_dup 1)))] - "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS && !flag_trapping_math" + "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS + && TARGET_SINGLE_FLOAT && !flag_trapping_math" "{ rs6000_emit_minmax (operands[0], SMIN, operands[1], operands[2]); DONE;}") (define_split @@ -5518,7 +5998,8 
@@ (match_operator:SF 3 "min_max_operator" [(match_operand:SF 1 "gpc_reg_operand" "") (match_operand:SF 2 "gpc_reg_operand" "")]))] - "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS && !flag_trapping_math" + "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS + && TARGET_SINGLE_FLOAT && !flag_trapping_math" [(const_int 0)] " { rs6000_emit_minmax (operands[0], GET_CODE (operands[3]), @@ -5526,12 +6007,12 @@ DONE; }") -(define_expand "movsicc" - [(set (match_operand:SI 0 "gpc_reg_operand" "") - (if_then_else:SI (match_operand 1 "comparison_operator" "") - (match_operand:SI 2 "gpc_reg_operand" "") - (match_operand:SI 3 "gpc_reg_operand" "")))] - "TARGET_ISEL" +(define_expand "mov<mode>cc" + [(set (match_operand:GPR 0 "gpc_reg_operand" "") + (if_then_else:GPR (match_operand 1 "comparison_operator" "") + (match_operand:GPR 2 "gpc_reg_operand" "") + (match_operand:GPR 3 "gpc_reg_operand" "")))] + "TARGET_ISEL<sel>" " { if (rs6000_emit_cmove (operands[0], operands[1], operands[2], operands[3])) @@ -5548,28 +6029,28 @@ ;; leave out the mode in operand 4 and use one pattern, but reload can ;; change the mode underneath our feet and then gets confused trying ;; to reload the value. 
-(define_insn "isel_signed" - [(set (match_operand:SI 0 "gpc_reg_operand" "=r") - (if_then_else:SI +(define_insn "isel_signed_<mode>" + [(set (match_operand:GPR 0 "gpc_reg_operand" "=r") + (if_then_else:GPR (match_operator 1 "comparison_operator" [(match_operand:CC 4 "cc_reg_operand" "y") (const_int 0)]) - (match_operand:SI 2 "gpc_reg_operand" "b") - (match_operand:SI 3 "gpc_reg_operand" "b")))] - "TARGET_ISEL" + (match_operand:GPR 2 "gpc_reg_operand" "b") + (match_operand:GPR 3 "gpc_reg_operand" "b")))] + "TARGET_ISEL<sel>" "* { return output_isel (operands); }" [(set_attr "length" "4")]) -(define_insn "isel_unsigned" - [(set (match_operand:SI 0 "gpc_reg_operand" "=r") - (if_then_else:SI +(define_insn "isel_unsigned_<mode>" + [(set (match_operand:GPR 0 "gpc_reg_operand" "=r") + (if_then_else:GPR (match_operator 1 "comparison_operator" [(match_operand:CCUNS 4 "cc_reg_operand" "y") (const_int 0)]) - (match_operand:SI 2 "gpc_reg_operand" "b") - (match_operand:SI 3 "gpc_reg_operand" "b")))] - "TARGET_ISEL" + (match_operand:GPR 2 "gpc_reg_operand" "b") + (match_operand:GPR 3 "gpc_reg_operand" "b")))] + "TARGET_ISEL<sel>" "* { return output_isel (operands); }" [(set_attr "length" "4")]) @@ -5579,7 +6060,7 @@ (if_then_else:SF (match_operand 1 "comparison_operator" "") (match_operand:SF 2 "gpc_reg_operand" "") (match_operand:SF 3 "gpc_reg_operand" "")))] - "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS" + "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT" " { if (rs6000_emit_cmove (operands[0], operands[1], operands[2], operands[3])) @@ -5594,50 +6075,53 @@ (match_operand:SF 4 "zero_fp_constant" "F")) (match_operand:SF 2 "gpc_reg_operand" "f") (match_operand:SF 3 "gpc_reg_operand" "f")))] - "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS" + "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT" "fsel %0,%1,%2,%3" [(set_attr "type" "fp")]) (define_insn "*fseldfsf4" [(set (match_operand:SF 0 
"gpc_reg_operand" "=f") - (if_then_else:SF (ge (match_operand:DF 1 "gpc_reg_operand" "f") + (if_then_else:SF (ge (match_operand:DF 1 "gpc_reg_operand" "d") (match_operand:DF 4 "zero_fp_constant" "F")) (match_operand:SF 2 "gpc_reg_operand" "f") (match_operand:SF 3 "gpc_reg_operand" "f")))] - "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS" + "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && TARGET_SINGLE_FLOAT" "fsel %0,%1,%2,%3" [(set_attr "type" "fp")]) (define_expand "negdf2" [(set (match_operand:DF 0 "gpc_reg_operand" "") (neg:DF (match_operand:DF 1 "gpc_reg_operand" "")))] - "TARGET_HARD_FLOAT && (TARGET_FPRS || TARGET_E500_DOUBLE)" + "TARGET_HARD_FLOAT && ((TARGET_FPRS && TARGET_DOUBLE_FLOAT) || TARGET_E500_DOUBLE)" "") (define_insn "*negdf2_fpr" - [(set (match_operand:DF 0 "gpc_reg_operand" "=f") - (neg:DF (match_operand:DF 1 "gpc_reg_operand" "f")))] - "TARGET_HARD_FLOAT && TARGET_FPRS" + [(set (match_operand:DF 0 "gpc_reg_operand" "=d") + (neg:DF (match_operand:DF 1 "gpc_reg_operand" "d")))] + "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT + && !VECTOR_UNIT_VSX_P (DFmode)" "fneg %0,%1" [(set_attr "type" "fp")]) (define_expand "absdf2" [(set (match_operand:DF 0 "gpc_reg_operand" "") (abs:DF (match_operand:DF 1 "gpc_reg_operand" "")))] - "TARGET_HARD_FLOAT && (TARGET_FPRS || TARGET_E500_DOUBLE)" + "TARGET_HARD_FLOAT && ((TARGET_FPRS && TARGET_DOUBLE_FLOAT) || TARGET_E500_DOUBLE)" "") (define_insn "*absdf2_fpr" - [(set (match_operand:DF 0 "gpc_reg_operand" "=f") - (abs:DF (match_operand:DF 1 "gpc_reg_operand" "f")))] - "TARGET_HARD_FLOAT && TARGET_FPRS" + [(set (match_operand:DF 0 "gpc_reg_operand" "=d") + (abs:DF (match_operand:DF 1 "gpc_reg_operand" "d")))] + "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT + && !VECTOR_UNIT_VSX_P (DFmode)" "fabs %0,%1" [(set_attr "type" "fp")]) (define_insn "*nabsdf2_fpr" - [(set (match_operand:DF 0 "gpc_reg_operand" "=f") - (neg:DF (abs:DF (match_operand:DF 1 
"gpc_reg_operand" "f"))))] - "TARGET_HARD_FLOAT && TARGET_FPRS" + [(set (match_operand:DF 0 "gpc_reg_operand" "=d") + (neg:DF (abs:DF (match_operand:DF 1 "gpc_reg_operand" "d"))))] + "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT + && !VECTOR_UNIT_VSX_P (DFmode)" "fnabs %0,%1" [(set_attr "type" "fp")]) @@ -5645,66 +6129,75 @@ [(set (match_operand:DF 0 "gpc_reg_operand" "") (plus:DF (match_operand:DF 1 "gpc_reg_operand" "") (match_operand:DF 2 "gpc_reg_operand" "")))] - "TARGET_HARD_FLOAT && (TARGET_FPRS || TARGET_E500_DOUBLE)" + "TARGET_HARD_FLOAT && ((TARGET_FPRS && TARGET_DOUBLE_FLOAT) || TARGET_E500_DOUBLE)" "") (define_insn "*adddf3_fpr" - [(set (match_operand:DF 0 "gpc_reg_operand" "=f") - (plus:DF (match_operand:DF 1 "gpc_reg_operand" "%f") - (match_operand:DF 2 "gpc_reg_operand" "f")))] - "TARGET_HARD_FLOAT && TARGET_FPRS" + [(set (match_operand:DF 0 "gpc_reg_operand" "=d") + (plus:DF (match_operand:DF 1 "gpc_reg_operand" "%d") + (match_operand:DF 2 "gpc_reg_operand" "d")))] + "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT + && !VECTOR_UNIT_VSX_P (DFmode)" "{fa|fadd} %0,%1,%2" - [(set_attr "type" "fp")]) + [(set_attr "type" "fp") + (set_attr "fp_type" "fp_addsub_d")]) (define_expand "subdf3" [(set (match_operand:DF 0 "gpc_reg_operand" "") (minus:DF (match_operand:DF 1 "gpc_reg_operand" "") (match_operand:DF 2 "gpc_reg_operand" "")))] - "TARGET_HARD_FLOAT && (TARGET_FPRS || TARGET_E500_DOUBLE)" + "TARGET_HARD_FLOAT && ((TARGET_FPRS && TARGET_DOUBLE_FLOAT) || TARGET_E500_DOUBLE)" "") (define_insn "*subdf3_fpr" - [(set (match_operand:DF 0 "gpc_reg_operand" "=f") - (minus:DF (match_operand:DF 1 "gpc_reg_operand" "f") - (match_operand:DF 2 "gpc_reg_operand" "f")))] - "TARGET_HARD_FLOAT && TARGET_FPRS" + [(set (match_operand:DF 0 "gpc_reg_operand" "=d") + (minus:DF (match_operand:DF 1 "gpc_reg_operand" "d") + (match_operand:DF 2 "gpc_reg_operand" "d")))] + "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT + && !VECTOR_UNIT_VSX_P 
(DFmode)" "{fs|fsub} %0,%1,%2" - [(set_attr "type" "fp")]) + [(set_attr "type" "fp") + (set_attr "fp_type" "fp_addsub_d")]) (define_expand "muldf3" [(set (match_operand:DF 0 "gpc_reg_operand" "") (mult:DF (match_operand:DF 1 "gpc_reg_operand" "") (match_operand:DF 2 "gpc_reg_operand" "")))] - "TARGET_HARD_FLOAT && (TARGET_FPRS || TARGET_E500_DOUBLE)" + "TARGET_HARD_FLOAT && ((TARGET_FPRS && TARGET_DOUBLE_FLOAT) || TARGET_E500_DOUBLE)" "") (define_insn "*muldf3_fpr" - [(set (match_operand:DF 0 "gpc_reg_operand" "=f") - (mult:DF (match_operand:DF 1 "gpc_reg_operand" "%f") - (match_operand:DF 2 "gpc_reg_operand" "f")))] - "TARGET_HARD_FLOAT && TARGET_FPRS" + [(set (match_operand:DF 0 "gpc_reg_operand" "=d") + (mult:DF (match_operand:DF 1 "gpc_reg_operand" "%d") + (match_operand:DF 2 "gpc_reg_operand" "d")))] + "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT + && !VECTOR_UNIT_VSX_P (DFmode)" "{fm|fmul} %0,%1,%2" - [(set_attr "type" "dmul")]) + [(set_attr "type" "dmul") + (set_attr "fp_type" "fp_mul_d")]) (define_expand "divdf3" [(set (match_operand:DF 0 "gpc_reg_operand" "") (div:DF (match_operand:DF 1 "gpc_reg_operand" "") (match_operand:DF 2 "gpc_reg_operand" "")))] - "TARGET_HARD_FLOAT && (TARGET_FPRS || TARGET_E500_DOUBLE)" + "TARGET_HARD_FLOAT + && ((TARGET_FPRS && TARGET_DOUBLE_FLOAT) || TARGET_E500_DOUBLE) + && !TARGET_SIMPLE_FPU" "") (define_insn "*divdf3_fpr" - [(set (match_operand:DF 0 "gpc_reg_operand" "=f") - (div:DF (match_operand:DF 1 "gpc_reg_operand" "f") - (match_operand:DF 2 "gpc_reg_operand" "f")))] - "TARGET_HARD_FLOAT && TARGET_FPRS" + [(set (match_operand:DF 0 "gpc_reg_operand" "=d") + (div:DF (match_operand:DF 1 "gpc_reg_operand" "d") + (match_operand:DF 2 "gpc_reg_operand" "d")))] + "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && !TARGET_SIMPLE_FPU + && !VECTOR_UNIT_VSX_P (DFmode)" "{fd|fdiv} %0,%1,%2" [(set_attr "type" "ddiv")]) (define_expand "recipdf3" - [(set (match_operand:DF 0 "gpc_reg_operand" "=f") - (unspec:DF 
[(match_operand:DF 1 "gpc_reg_operand" "f") - (match_operand:DF 2 "gpc_reg_operand" "f")] + [(set (match_operand:DF 0 "gpc_reg_operand" "=d") + (unspec:DF [(match_operand:DF 1 "gpc_reg_operand" "d") + (match_operand:DF 2 "gpc_reg_operand" "d")] UNSPEC_FRES))] "TARGET_RECIP && TARGET_HARD_FLOAT && TARGET_POPCNTB && !optimize_size && flag_finite_math_only && !flag_trapping_math" @@ -5713,75 +6206,91 @@ DONE; }) -(define_insn "fred" +(define_expand "fred" + [(set (match_operand:DF 0 "gpc_reg_operand" "=d") + (unspec:DF [(match_operand:DF 1 "gpc_reg_operand" "d")] UNSPEC_FRES))] + "(TARGET_POPCNTB || VECTOR_UNIT_VSX_P (DFmode)) && flag_finite_math_only" + "") + +(define_insn "*fred_fpr" [(set (match_operand:DF 0 "gpc_reg_operand" "=f") (unspec:DF [(match_operand:DF 1 "gpc_reg_operand" "f")] UNSPEC_FRES))] - "TARGET_POPCNTB && flag_finite_math_only" + "TARGET_POPCNTB && flag_finite_math_only && !VECTOR_UNIT_VSX_P (DFmode)" "fre %0,%1" [(set_attr "type" "fp")]) -(define_insn "" - [(set (match_operand:DF 0 "gpc_reg_operand" "=f") - (plus:DF (mult:DF (match_operand:DF 1 "gpc_reg_operand" "%f") - (match_operand:DF 2 "gpc_reg_operand" "f")) - (match_operand:DF 3 "gpc_reg_operand" "f")))] - "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD" +(define_insn "*fmadddf4_fpr" + [(set (match_operand:DF 0 "gpc_reg_operand" "=d") + (plus:DF (mult:DF (match_operand:DF 1 "gpc_reg_operand" "%d") + (match_operand:DF 2 "gpc_reg_operand" "d")) + (match_operand:DF 3 "gpc_reg_operand" "d")))] + "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD && TARGET_DOUBLE_FLOAT + && VECTOR_UNIT_NONE_P (DFmode)" "{fma|fmadd} %0,%1,%2,%3" - [(set_attr "type" "dmul")]) + [(set_attr "type" "dmul") + (set_attr "fp_type" "fp_maddsub_d")]) -(define_insn "" - [(set (match_operand:DF 0 "gpc_reg_operand" "=f") - (minus:DF (mult:DF (match_operand:DF 1 "gpc_reg_operand" "%f") - (match_operand:DF 2 "gpc_reg_operand" "f")) - (match_operand:DF 3 "gpc_reg_operand" "f")))] - "TARGET_HARD_FLOAT && TARGET_FPRS && 
TARGET_FUSED_MADD" +(define_insn "*fmsubdf4_fpr" + [(set (match_operand:DF 0 "gpc_reg_operand" "=d") + (minus:DF (mult:DF (match_operand:DF 1 "gpc_reg_operand" "%d") + (match_operand:DF 2 "gpc_reg_operand" "d")) + (match_operand:DF 3 "gpc_reg_operand" "d")))] + "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD && TARGET_DOUBLE_FLOAT + && VECTOR_UNIT_NONE_P (DFmode)" "{fms|fmsub} %0,%1,%2,%3" - [(set_attr "type" "dmul")]) + [(set_attr "type" "dmul") + (set_attr "fp_type" "fp_maddsub_d")]) -(define_insn "" - [(set (match_operand:DF 0 "gpc_reg_operand" "=f") - (neg:DF (plus:DF (mult:DF (match_operand:DF 1 "gpc_reg_operand" "%f") - (match_operand:DF 2 "gpc_reg_operand" "f")) - (match_operand:DF 3 "gpc_reg_operand" "f"))))] - "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD - && HONOR_SIGNED_ZEROS (DFmode)" +(define_insn "*fnmadddf4_fpr_1" + [(set (match_operand:DF 0 "gpc_reg_operand" "=d") + (neg:DF (plus:DF (mult:DF (match_operand:DF 1 "gpc_reg_operand" "%d") + (match_operand:DF 2 "gpc_reg_operand" "d")) + (match_operand:DF 3 "gpc_reg_operand" "d"))))] + "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD && TARGET_DOUBLE_FLOAT + && HONOR_SIGNED_ZEROS (DFmode) && VECTOR_UNIT_NONE_P (DFmode)" "{fnma|fnmadd} %0,%1,%2,%3" - [(set_attr "type" "dmul")]) + [(set_attr "type" "dmul") + (set_attr "fp_type" "fp_maddsub_d")]) -(define_insn "" - [(set (match_operand:DF 0 "gpc_reg_operand" "=f") - (minus:DF (mult:DF (neg:DF (match_operand:DF 1 "gpc_reg_operand" "f")) - (match_operand:DF 2 "gpc_reg_operand" "f")) - (match_operand:DF 3 "gpc_reg_operand" "f")))] - "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD - && ! 
HONOR_SIGNED_ZEROS (DFmode)" +(define_insn "*fnmadddf4_fpr_2" + [(set (match_operand:DF 0 "gpc_reg_operand" "=d") + (minus:DF (mult:DF (neg:DF (match_operand:DF 1 "gpc_reg_operand" "d")) + (match_operand:DF 2 "gpc_reg_operand" "d")) + (match_operand:DF 3 "gpc_reg_operand" "d")))] + "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD && TARGET_DOUBLE_FLOAT + && ! HONOR_SIGNED_ZEROS (DFmode) && VECTOR_UNIT_NONE_P (DFmode)" "{fnma|fnmadd} %0,%1,%2,%3" - [(set_attr "type" "dmul")]) + [(set_attr "type" "dmul") + (set_attr "fp_type" "fp_maddsub_d")]) -(define_insn "" - [(set (match_operand:DF 0 "gpc_reg_operand" "=f") - (neg:DF (minus:DF (mult:DF (match_operand:DF 1 "gpc_reg_operand" "%f") - (match_operand:DF 2 "gpc_reg_operand" "f")) - (match_operand:DF 3 "gpc_reg_operand" "f"))))] - "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD - && HONOR_SIGNED_ZEROS (DFmode)" +(define_insn "*fnmsubdf4_fpr_1" + [(set (match_operand:DF 0 "gpc_reg_operand" "=d") + (neg:DF (minus:DF (mult:DF (match_operand:DF 1 "gpc_reg_operand" "%d") + (match_operand:DF 2 "gpc_reg_operand" "d")) + (match_operand:DF 3 "gpc_reg_operand" "d"))))] + "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD && TARGET_DOUBLE_FLOAT + && HONOR_SIGNED_ZEROS (DFmode) && VECTOR_UNIT_NONE_P (DFmode)" "{fnms|fnmsub} %0,%1,%2,%3" - [(set_attr "type" "dmul")]) + [(set_attr "type" "dmul") + (set_attr "fp_type" "fp_maddsub_d")]) -(define_insn "" - [(set (match_operand:DF 0 "gpc_reg_operand" "=f") - (minus:DF (match_operand:DF 3 "gpc_reg_operand" "f") - (mult:DF (match_operand:DF 1 "gpc_reg_operand" "%f") - (match_operand:DF 2 "gpc_reg_operand" "f"))))] - "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD - && ! 
HONOR_SIGNED_ZEROS (DFmode)" +(define_insn "*fnmsubdf4_fpr_2" + [(set (match_operand:DF 0 "gpc_reg_operand" "=d") + (minus:DF (match_operand:DF 3 "gpc_reg_operand" "d") + (mult:DF (match_operand:DF 1 "gpc_reg_operand" "%d") + (match_operand:DF 2 "gpc_reg_operand" "d"))))] + "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD && TARGET_DOUBLE_FLOAT + && ! HONOR_SIGNED_ZEROS (DFmode) && VECTOR_UNIT_NONE_P (DFmode)" "{fnms|fnmsub} %0,%1,%2,%3" - [(set_attr "type" "dmul")]) + [(set_attr "type" "dmul") + (set_attr "fp_type" "fp_maddsub_d")]) (define_insn "sqrtdf2" - [(set (match_operand:DF 0 "gpc_reg_operand" "=f") - (sqrt:DF (match_operand:DF 1 "gpc_reg_operand" "f")))] - "(TARGET_PPC_GPOPT || TARGET_POWER2) && TARGET_HARD_FLOAT && TARGET_FPRS" + [(set (match_operand:DF 0 "gpc_reg_operand" "=d") + (sqrt:DF (match_operand:DF 1 "gpc_reg_operand" "d")))] + "(TARGET_PPC_GPOPT || TARGET_POWER2) && TARGET_HARD_FLOAT && TARGET_FPRS + && TARGET_DOUBLE_FLOAT + && !VECTOR_UNIT_VSX_P (DFmode)" "fsqrt %0,%1" [(set_attr "type" "dsqrt")]) @@ -5794,7 +6303,8 @@ (match_operand:DF 2 "gpc_reg_operand" "")) (match_dup 1) (match_dup 2)))] - "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS && !flag_trapping_math" + "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT + && !flag_trapping_math" "{ rs6000_emit_minmax (operands[0], SMAX, operands[1], operands[2]); DONE;}") (define_expand "smindf3" @@ -5803,7 +6313,8 @@ (match_operand:DF 2 "gpc_reg_operand" "")) (match_dup 2) (match_dup 1)))] - "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS && !flag_trapping_math" + "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT + && !flag_trapping_math" "{ rs6000_emit_minmax (operands[0], SMIN, operands[1], operands[2]); DONE;}") (define_split @@ -5811,7 +6322,8 @@ (match_operator:DF 3 "min_max_operator" [(match_operand:DF 1 "gpc_reg_operand" "") (match_operand:DF 2 "gpc_reg_operand" "")]))] - "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && 
TARGET_FPRS && !flag_trapping_math" + "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT + && !flag_trapping_math" [(const_int 0)] " { rs6000_emit_minmax (operands[0], GET_CODE (operands[3]), @@ -5824,7 +6336,7 @@ (if_then_else:DF (match_operand 1 "comparison_operator" "") (match_operand:DF 2 "gpc_reg_operand" "") (match_operand:DF 3 "gpc_reg_operand" "")))] - "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS" + "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT" " { if (rs6000_emit_cmove (operands[0], operands[1], operands[2], operands[3])) @@ -5834,22 +6346,22 @@ }") (define_insn "*fseldfdf4" - [(set (match_operand:DF 0 "gpc_reg_operand" "=f") - (if_then_else:DF (ge (match_operand:DF 1 "gpc_reg_operand" "f") + [(set (match_operand:DF 0 "gpc_reg_operand" "=d") + (if_then_else:DF (ge (match_operand:DF 1 "gpc_reg_operand" "d") (match_operand:DF 4 "zero_fp_constant" "F")) - (match_operand:DF 2 "gpc_reg_operand" "f") - (match_operand:DF 3 "gpc_reg_operand" "f")))] - "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS" + (match_operand:DF 2 "gpc_reg_operand" "d") + (match_operand:DF 3 "gpc_reg_operand" "d")))] + "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT" "fsel %0,%1,%2,%3" [(set_attr "type" "fp")]) (define_insn "*fselsfdf4" - [(set (match_operand:DF 0 "gpc_reg_operand" "=f") + [(set (match_operand:DF 0 "gpc_reg_operand" "=d") (if_then_else:DF (ge (match_operand:SF 1 "gpc_reg_operand" "f") (match_operand:SF 4 "zero_fp_constant" "F")) - (match_operand:DF 2 "gpc_reg_operand" "f") - (match_operand:DF 3 "gpc_reg_operand" "f")))] - "TARGET_PPC_GFXOPT" + (match_operand:DF 2 "gpc_reg_operand" "d") + (match_operand:DF 3 "gpc_reg_operand" "d")))] + "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && TARGET_SINGLE_FLOAT" "fsel %0,%1,%2,%3" [(set_attr "type" "fp")]) @@ -5858,13 +6370,25 @@ (define_expand "fixuns_truncsfsi2" [(set (match_operand:SI 0 
"gpc_reg_operand" "") (unsigned_fix:SI (match_operand:SF 1 "gpc_reg_operand" "")))] - "TARGET_HARD_FLOAT && !TARGET_FPRS" + "TARGET_HARD_FLOAT && !TARGET_FPRS && TARGET_SINGLE_FLOAT" "") (define_expand "fix_truncsfsi2" + [(set (match_operand:SI 0 "gpc_reg_operand" "") + (fix:SI (match_operand:SF 1 "gpc_reg_operand" "")))] + "TARGET_HARD_FLOAT && !TARGET_FPRS && TARGET_SINGLE_FLOAT" + "") + +(define_expand "fixuns_truncdfsi2" [(set (match_operand:SI 0 "gpc_reg_operand" "") - (fix:SI (match_operand:SF 1 "gpc_reg_operand" "")))] - "TARGET_HARD_FLOAT && !TARGET_FPRS" + (unsigned_fix:SI (match_operand:DF 1 "gpc_reg_operand" "")))] + "TARGET_HARD_FLOAT && TARGET_E500_DOUBLE" + "") + +(define_expand "fixuns_truncdfdi2" + [(set (match_operand:DI 0 "register_operand" "") + (unsigned_fix:DI (match_operand:DF 1 "register_operand" "")))] + "TARGET_HARD_FLOAT && TARGET_VSX" "") ; For each of these conversions, there is a define_expand, a define_insn @@ -5880,7 +6404,8 @@ (clobber (match_dup 4)) (clobber (match_dup 5)) (clobber (match_dup 6))])] - "TARGET_HARD_FLOAT && (TARGET_FPRS || TARGET_E500_DOUBLE)" + "TARGET_HARD_FLOAT + && ((TARGET_FPRS && TARGET_DOUBLE_FLOAT) || TARGET_E500_DOUBLE)" " { if (TARGET_E500_DOUBLE) @@ -5888,18 +6413,10 @@ emit_insn (gen_spe_floatsidf2 (operands[0], operands[1])); DONE; } - if (TARGET_POWERPC64 && TARGET_MFPGPR && TARGET_HARD_FLOAT && TARGET_FPRS) - { - rtx t1 = gen_reg_rtx (DImode); - emit_insn (gen_floatsidf_ppc64_mfpgpr (operands[0], operands[1], t1)); - DONE; - } if (TARGET_POWERPC64) { - rtx mem = assign_stack_temp (DImode, GET_MODE_SIZE (DImode), 0); - rtx t1 = gen_reg_rtx (DImode); - rtx t2 = gen_reg_rtx (DImode); - emit_insn (gen_floatsidf_ppc64 (operands[0], operands[1], mem, t1, t2)); + rtx x = convert_to_mode (DImode, operands[1], 0); + emit_insn (gen_floatdidf2 (operands[0], x)); DONE; } @@ -5911,16 +6428,16 @@ }") (define_insn_and_split "*floatsidf2_internal" - [(set (match_operand:DF 0 "gpc_reg_operand" "=&f") + [(set 
(match_operand:DF 0 "gpc_reg_operand" "=&d") (float:DF (match_operand:SI 1 "gpc_reg_operand" "r"))) (use (match_operand:SI 2 "gpc_reg_operand" "r")) - (use (match_operand:DF 3 "gpc_reg_operand" "f")) + (use (match_operand:DF 3 "gpc_reg_operand" "d")) (clobber (match_operand:DF 4 "offsettable_mem_operand" "=o")) - (clobber (match_operand:DF 5 "gpc_reg_operand" "=&f")) + (clobber (match_operand:DF 5 "gpc_reg_operand" "=&d")) (clobber (match_operand:SI 6 "gpc_reg_operand" "=&r"))] - "! TARGET_POWERPC64 && TARGET_HARD_FLOAT && TARGET_FPRS" + "! TARGET_POWERPC64 && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT" "#" - "&& (can_create_pseudo_p () || offsettable_nonstrict_memref_p (operands[4]))" + "" [(pc)] " { @@ -5947,7 +6464,7 @@ (define_expand "floatunssisf2" [(set (match_operand:SF 0 "gpc_reg_operand" "") (unsigned_float:SF (match_operand:SI 1 "gpc_reg_operand" "")))] - "TARGET_HARD_FLOAT && !TARGET_FPRS" + "TARGET_HARD_FLOAT && !TARGET_FPRS && TARGET_SINGLE_FLOAT" "") (define_expand "floatunssidf2" @@ -5957,7 +6474,7 @@ (use (match_dup 3)) (clobber (match_dup 4)) (clobber (match_dup 5))])] - "TARGET_HARD_FLOAT && (TARGET_FPRS || TARGET_E500_DOUBLE)" + "TARGET_HARD_FLOAT && ((TARGET_FPRS && TARGET_DOUBLE_FLOAT) || TARGET_E500_DOUBLE)" " { if (TARGET_E500_DOUBLE) @@ -5967,11 +6484,8 @@ } if (TARGET_POWERPC64) { - rtx mem = assign_stack_temp (DImode, GET_MODE_SIZE (DImode), 0); - rtx t1 = gen_reg_rtx (DImode); - rtx t2 = gen_reg_rtx (DImode); - emit_insn (gen_floatunssidf_ppc64 (operands[0], operands[1], mem, - t1, t2)); + rtx x = convert_to_mode (DImode, operands[1], 1); + emit_insn (gen_floatdidf2 (operands[0], x)); DONE; } @@ -5982,15 +6496,15 @@ }") (define_insn_and_split "*floatunssidf2_internal" - [(set (match_operand:DF 0 "gpc_reg_operand" "=&f") + [(set (match_operand:DF 0 "gpc_reg_operand" "=&d") (unsigned_float:DF (match_operand:SI 1 "gpc_reg_operand" "r"))) (use (match_operand:SI 2 "gpc_reg_operand" "r")) - (use (match_operand:DF 3 
"gpc_reg_operand" "f")) + (use (match_operand:DF 3 "gpc_reg_operand" "d")) (clobber (match_operand:DF 4 "offsettable_mem_operand" "=o")) - (clobber (match_operand:DF 5 "gpc_reg_operand" "=&f"))] - "! TARGET_POWERPC64 && TARGET_HARD_FLOAT && TARGET_FPRS" + (clobber (match_operand:DF 5 "gpc_reg_operand" "=&d"))] + "! TARGET_POWERPC64 && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT" "#" - "&& (can_create_pseudo_p () || offsettable_nonstrict_memref_p (operands[4]))" + "" [(pc)] " { @@ -6018,7 +6532,7 @@ (clobber (match_dup 2)) (clobber (match_dup 3))])] "(TARGET_POWER2 || TARGET_POWERPC) - && TARGET_HARD_FLOAT && (TARGET_FPRS || TARGET_E500_DOUBLE)" + && TARGET_HARD_FLOAT && ((TARGET_FPRS && TARGET_DOUBLE_FLOAT) || TARGET_E500_DOUBLE)" " { if (TARGET_E500_DOUBLE) @@ -6051,12 +6565,13 @@ (define_insn_and_split "*fix_truncdfsi2_internal" [(set (match_operand:SI 0 "gpc_reg_operand" "=r") - (fix:SI (match_operand:DF 1 "gpc_reg_operand" "f"))) - (clobber (match_operand:DI 2 "gpc_reg_operand" "=f")) + (fix:SI (match_operand:DF 1 "gpc_reg_operand" "d"))) + (clobber (match_operand:DI 2 "gpc_reg_operand" "=d")) (clobber (match_operand:DI 3 "offsettable_mem_operand" "=o"))] - "(TARGET_POWER2 || TARGET_POWERPC) && TARGET_HARD_FLOAT && TARGET_FPRS" + "(TARGET_POWER2 || TARGET_POWERPC) && TARGET_HARD_FLOAT && TARGET_FPRS + && TARGET_DOUBLE_FLOAT" "#" - "&& (can_create_pseudo_p () || offsettable_nonstrict_memref_p (operands[3]))" + "" [(pc)] " { @@ -6073,9 +6588,10 @@ (define_insn_and_split "fix_truncdfsi2_internal_gfxopt" [(set (match_operand:SI 0 "memory_operand" "=Z") - (fix:SI (match_operand:DF 1 "gpc_reg_operand" "f"))) - (clobber (match_operand:DI 2 "gpc_reg_operand" "=f"))] - "(TARGET_POWER2 || TARGET_POWERPC) && TARGET_HARD_FLOAT && TARGET_FPRS + (fix:SI (match_operand:DF 1 "gpc_reg_operand" "d"))) + (clobber (match_operand:DI 2 "gpc_reg_operand" "=d"))] + "(TARGET_POWER2 || TARGET_POWERPC) && TARGET_HARD_FLOAT && TARGET_FPRS + && TARGET_DOUBLE_FLOAT && 
TARGET_PPC_GFXOPT" "#" "&& 1" @@ -6090,10 +6606,11 @@ (define_insn_and_split "fix_truncdfsi2_mfpgpr" [(set (match_operand:SI 0 "gpc_reg_operand" "=r") - (fix:SI (match_operand:DF 1 "gpc_reg_operand" "f"))) - (clobber (match_operand:DI 2 "gpc_reg_operand" "=f")) + (fix:SI (match_operand:DF 1 "gpc_reg_operand" "d"))) + (clobber (match_operand:DI 2 "gpc_reg_operand" "=d")) (clobber (match_operand:DI 3 "gpc_reg_operand" "=r"))] - "TARGET_POWERPC64 && TARGET_MFPGPR && TARGET_HARD_FLOAT && TARGET_FPRS" + "TARGET_POWERPC64 && TARGET_MFPGPR && TARGET_HARD_FLOAT && TARGET_FPRS + && TARGET_DOUBLE_FLOAT" "#" "&& 1" [(set (match_dup 2) (unspec:DI [(fix:SI (match_dup 1))] UNSPEC_FCTIWZ)) @@ -6107,73 +6624,102 @@ ; because the first makes it clear that operand 0 is not live ; before the instruction. (define_insn "fctiwz" - [(set (match_operand:DI 0 "gpc_reg_operand" "=f") - (unspec:DI [(fix:SI (match_operand:DF 1 "gpc_reg_operand" "f"))] + [(set (match_operand:DI 0 "gpc_reg_operand" "=d") + (unspec:DI [(fix:SI (match_operand:DF 1 "gpc_reg_operand" "d"))] UNSPEC_FCTIWZ))] - "(TARGET_POWER2 || TARGET_POWERPC) && TARGET_HARD_FLOAT && TARGET_FPRS" + "(TARGET_POWER2 || TARGET_POWERPC) && TARGET_HARD_FLOAT && TARGET_FPRS + && TARGET_DOUBLE_FLOAT" "{fcirz|fctiwz} %0,%1" [(set_attr "type" "fp")]) -(define_insn "btruncdf2" +(define_expand "btruncdf2" + [(set (match_operand:DF 0 "gpc_reg_operand" "=d") + (unspec:DF [(match_operand:DF 1 "gpc_reg_operand" "d")] UNSPEC_FRIZ))] + "TARGET_FPRND && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT" + "") + +(define_insn "*btruncdf2_fpr" [(set (match_operand:DF 0 "gpc_reg_operand" "=f") (unspec:DF [(match_operand:DF 1 "gpc_reg_operand" "f")] UNSPEC_FRIZ))] - "TARGET_FPRND && TARGET_HARD_FLOAT && TARGET_FPRS" + "TARGET_FPRND && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT + && !VECTOR_UNIT_VSX_P (DFmode)" "friz %0,%1" [(set_attr "type" "fp")]) (define_insn "btruncsf2" [(set (match_operand:SF 0 "gpc_reg_operand" "=f") (unspec:SF 
[(match_operand:SF 1 "gpc_reg_operand" "f")] UNSPEC_FRIZ))] - "TARGET_FPRND && TARGET_HARD_FLOAT && TARGET_FPRS" + "TARGET_FPRND && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT" "friz %0,%1" [(set_attr "type" "fp")]) -(define_insn "ceildf2" - [(set (match_operand:DF 0 "gpc_reg_operand" "=f") - (unspec:DF [(match_operand:DF 1 "gpc_reg_operand" "f")] UNSPEC_FRIP))] - "TARGET_FPRND && TARGET_HARD_FLOAT && TARGET_FPRS" +(define_expand "ceildf2" + [(set (match_operand:DF 0 "gpc_reg_operand" "") + (unspec:DF [(match_operand:DF 1 "gpc_reg_operand" "")] UNSPEC_FRIP))] + "TARGET_FPRND && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT" + "") + +(define_insn "*ceildf2_fpr" + [(set (match_operand:DF 0 "gpc_reg_operand" "=d") + (unspec:DF [(match_operand:DF 1 "gpc_reg_operand" "d")] UNSPEC_FRIP))] + "TARGET_FPRND && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT + && !VECTOR_UNIT_VSX_P (DFmode)" "frip %0,%1" [(set_attr "type" "fp")]) (define_insn "ceilsf2" [(set (match_operand:SF 0 "gpc_reg_operand" "=f") (unspec:SF [(match_operand:SF 1 "gpc_reg_operand" "f")] UNSPEC_FRIP))] - "TARGET_FPRND && TARGET_HARD_FLOAT && TARGET_FPRS" + "TARGET_FPRND && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT " "frip %0,%1" [(set_attr "type" "fp")]) -(define_insn "floordf2" - [(set (match_operand:DF 0 "gpc_reg_operand" "=f") - (unspec:DF [(match_operand:DF 1 "gpc_reg_operand" "f")] UNSPEC_FRIM))] - "TARGET_FPRND && TARGET_HARD_FLOAT && TARGET_FPRS" +(define_expand "floordf2" + [(set (match_operand:DF 0 "gpc_reg_operand" "") + (unspec:DF [(match_operand:DF 1 "gpc_reg_operand" "")] UNSPEC_FRIM))] + "TARGET_FPRND && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT" + "") + +(define_insn "*floordf2_fpr" + [(set (match_operand:DF 0 "gpc_reg_operand" "=d") + (unspec:DF [(match_operand:DF 1 "gpc_reg_operand" "d")] UNSPEC_FRIM))] + "TARGET_FPRND && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT + && !VECTOR_UNIT_VSX_P (DFmode)" "frim %0,%1" 
[(set_attr "type" "fp")]) (define_insn "floorsf2" [(set (match_operand:SF 0 "gpc_reg_operand" "=f") (unspec:SF [(match_operand:SF 1 "gpc_reg_operand" "f")] UNSPEC_FRIM))] - "TARGET_FPRND && TARGET_HARD_FLOAT && TARGET_FPRS" + "TARGET_FPRND && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT " "frim %0,%1" [(set_attr "type" "fp")]) +;; No VSX equivalent to frin (define_insn "rounddf2" - [(set (match_operand:DF 0 "gpc_reg_operand" "=f") - (unspec:DF [(match_operand:DF 1 "gpc_reg_operand" "f")] UNSPEC_FRIN))] - "TARGET_FPRND && TARGET_HARD_FLOAT && TARGET_FPRS" + [(set (match_operand:DF 0 "gpc_reg_operand" "=d") + (unspec:DF [(match_operand:DF 1 "gpc_reg_operand" "d")] UNSPEC_FRIN))] + "TARGET_FPRND && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT" "frin %0,%1" [(set_attr "type" "fp")]) (define_insn "roundsf2" [(set (match_operand:SF 0 "gpc_reg_operand" "=f") (unspec:SF [(match_operand:SF 1 "gpc_reg_operand" "f")] UNSPEC_FRIN))] - "TARGET_FPRND && TARGET_HARD_FLOAT && TARGET_FPRS" + "TARGET_FPRND && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT " "frin %0,%1" [(set_attr "type" "fp")]) +(define_expand "ftruncdf2" + [(set (match_operand:DF 0 "gpc_reg_operand" "") + (fix:DF (match_operand:DF 1 "gpc_reg_operand" "")))] + "VECTOR_UNIT_VSX_P (DFmode)" + "") + ; An UNSPEC is used so we don't have to support SImode in FP registers. 
(define_insn "stfiwx" [(set (match_operand:SI 0 "memory_operand" "=Z") - (unspec:SI [(match_operand:DI 1 "gpc_reg_operand" "f")] + (unspec:SI [(match_operand:DI 1 "gpc_reg_operand" "d")] UNSPEC_STFIWX))] "TARGET_PPC_GFXOPT" "stfiwx %1,%y0" @@ -6185,65 +6731,47 @@ "TARGET_HARD_FLOAT && !TARGET_FPRS" "") -(define_insn "floatdidf2" - [(set (match_operand:DF 0 "gpc_reg_operand" "=f") - (float:DF (match_operand:DI 1 "gpc_reg_operand" "*f")))] - "TARGET_POWERPC64 && TARGET_HARD_FLOAT && TARGET_FPRS" +(define_expand "floatdidf2" + [(set (match_operand:DF 0 "gpc_reg_operand" "") + (float:DF (match_operand:DI 1 "gpc_reg_operand" "")))] + "(TARGET_POWERPC64 || TARGET_XILINX_FPU || VECTOR_UNIT_VSX_P (DFmode)) + && TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT && TARGET_FPRS" + "") + +(define_insn "*floatdidf2_fpr" + [(set (match_operand:DF 0 "gpc_reg_operand" "=d") + (float:DF (match_operand:DI 1 "gpc_reg_operand" "!d#r")))] + "(TARGET_POWERPC64 || TARGET_XILINX_FPU) + && TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT && TARGET_FPRS + && !VECTOR_UNIT_VSX_P (DFmode)" "fcfid %0,%1" [(set_attr "type" "fp")]) -(define_insn_and_split "floatsidf_ppc64_mfpgpr" - [(set (match_operand:DF 0 "gpc_reg_operand" "=f") - (float:DF (match_operand:SI 1 "gpc_reg_operand" "r"))) - (clobber (match_operand:DI 2 "gpc_reg_operand" "=r"))] - "TARGET_POWERPC64 && TARGET_MFPGPR && TARGET_HARD_FLOAT && TARGET_FPRS" - "#" - "&& 1" - [(set (match_dup 2) (sign_extend:DI (match_dup 1))) - (set (match_dup 0) (float:DF (match_dup 2)))] - "") - -(define_insn_and_split "floatsidf_ppc64" - [(set (match_operand:DF 0 "gpc_reg_operand" "=f") - (float:DF (match_operand:SI 1 "gpc_reg_operand" "r"))) - (clobber (match_operand:DI 2 "offsettable_mem_operand" "=o")) - (clobber (match_operand:DI 3 "gpc_reg_operand" "=r")) - (clobber (match_operand:DI 4 "gpc_reg_operand" "=f"))] - "TARGET_POWERPC64 && !TARGET_MFPGPR && TARGET_HARD_FLOAT && TARGET_FPRS" - "#" - "&& 1" - [(set (match_dup 3) (sign_extend:DI (match_dup 1))) - (set 
(match_dup 2) (match_dup 3)) - (set (match_dup 4) (match_dup 2)) - (set (match_dup 0) (float:DF (match_dup 4)))] +(define_expand "floatunsdidf2" + [(set (match_operand:DF 0 "gpc_reg_operand" "") + (unsigned_float:DF (match_operand:DI 1 "gpc_reg_operand" "")))] + "TARGET_VSX" "") -(define_insn_and_split "floatunssidf_ppc64" - [(set (match_operand:DF 0 "gpc_reg_operand" "=f") - (unsigned_float:DF (match_operand:SI 1 "gpc_reg_operand" "r"))) - (clobber (match_operand:DI 2 "offsettable_mem_operand" "=o")) - (clobber (match_operand:DI 3 "gpc_reg_operand" "=r")) - (clobber (match_operand:DI 4 "gpc_reg_operand" "=f"))] - "TARGET_POWERPC64 && TARGET_HARD_FLOAT && TARGET_FPRS" - "#" - "&& 1" - [(set (match_dup 3) (zero_extend:DI (match_dup 1))) - (set (match_dup 2) (match_dup 3)) - (set (match_dup 4) (match_dup 2)) - (set (match_dup 0) (float:DF (match_dup 4)))] +(define_expand "fix_truncdfdi2" + [(set (match_operand:DI 0 "gpc_reg_operand" "") + (fix:DI (match_operand:DF 1 "gpc_reg_operand" "")))] + "(TARGET_POWERPC64 || TARGET_XILINX_FPU || VECTOR_UNIT_VSX_P (DFmode)) + && TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT && TARGET_FPRS" "") -(define_insn "fix_truncdfdi2" - [(set (match_operand:DI 0 "gpc_reg_operand" "=*f") - (fix:DI (match_operand:DF 1 "gpc_reg_operand" "f")))] - "TARGET_POWERPC64 && TARGET_HARD_FLOAT && TARGET_FPRS" +(define_insn "*fix_truncdfdi2_fpr" + [(set (match_operand:DI 0 "gpc_reg_operand" "=!d#r") + (fix:DI (match_operand:DF 1 "gpc_reg_operand" "d")))] + "(TARGET_POWERPC64 || TARGET_XILINX_FPU) && TARGET_HARD_FLOAT + && TARGET_DOUBLE_FLOAT && TARGET_FPRS && !VECTOR_UNIT_VSX_P (DFmode)" "fctidz %0,%1" [(set_attr "type" "fp")]) (define_expand "floatdisf2" [(set (match_operand:SF 0 "gpc_reg_operand" "") (float:SF (match_operand:DI 1 "gpc_reg_operand" "")))] - "TARGET_POWERPC64 && TARGET_HARD_FLOAT && TARGET_FPRS" + "TARGET_POWERPC64 && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT " " { rtx val = operands[1]; @@ -6263,9 +6791,9 @@ ;; from double 
rounding. (define_insn_and_split "floatdisf2_internal1" [(set (match_operand:SF 0 "gpc_reg_operand" "=f") - (float:SF (match_operand:DI 1 "gpc_reg_operand" "*f"))) - (clobber (match_scratch:DF 2 "=f"))] - "TARGET_POWERPC64 && TARGET_HARD_FLOAT && TARGET_FPRS" + (float:SF (match_operand:DI 1 "gpc_reg_operand" "!d#r"))) + (clobber (match_scratch:DF 2 "=d"))] + "TARGET_POWERPC64 && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT" "#" "&& reload_completed" [(set (match_dup 2) @@ -6299,7 +6827,7 @@ (label_ref (match_operand:DI 2 "" "")) (pc))) (set (match_dup 0) (match_dup 1))] - "TARGET_POWERPC64 && TARGET_HARD_FLOAT && TARGET_FPRS" + "TARGET_POWERPC64 && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT" " { operands[3] = gen_reg_rtx (DImode); @@ -7625,7 +8153,7 @@ andi. %0,%1,%b2 andis. %0,%1,%u2 #" - [(set_attr "type" "*,*,*,compare,compare,*") + [(set_attr "type" "*,*,*,fast_compare,fast_compare,*") (set_attr "length" "4,4,4,4,4,8")]) (define_insn "anddi3_nomc" @@ -7683,21 +8211,11 @@ # # #" - [(set_attr "type" "compare,compare,delayed_compare,compare,compare,compare,compare,compare,compare,compare,compare,compare") + [(set_attr "type" "fast_compare,compare,delayed_compare,fast_compare,\ + fast_compare,compare,compare,compare,compare,compare,\ + compare,compare") (set_attr "length" "4,4,4,4,4,8,8,8,8,8,8,12")]) -(define_insn "*anddi3_internal2_nomc" - [(set (match_operand:CC 0 "cc_reg_operand" "=x,?y,?y,??y,??y,?y") - (compare:CC (and:DI (match_operand:DI 1 "gpc_reg_operand" "%r,r,r,r,r,r") - (match_operand:DI 2 "and64_2_operand" "t,r,S,K,J,t")) - (const_int 0))) - (clobber (match_scratch:DI 3 "=r,r,r,r,r,r")) - (clobber (match_scratch:CC 4 "=X,X,X,x,x,X"))] - "TARGET_64BIT && !rs6000_gen_cell_microcode" - "#" - [(set_attr "type" "delayed_compare,compare,compare,compare,compare,compare") - (set_attr "length" "8,8,8,8,8,12")]) - (define_split [(set (match_operand:CC 0 "cc_reg_operand" "") (compare:CC (and:DI (match_operand:DI 1 "gpc_reg_operand" "") @@ 
-7746,21 +8264,11 @@ # # #" - [(set_attr "type" "compare,compare,delayed_compare,compare,compare,compare,compare,compare,compare,compare,compare,compare") + [(set_attr "type" "fast_compare,compare,delayed_compare,fast_compare,\ + fast_compare,compare,compare,compare,compare,compare,\ + compare,compare") (set_attr "length" "4,4,4,4,4,8,8,8,8,8,8,12")]) -(define_insn "*anddi3_internal3_nomc" - [(set (match_operand:CC 3 "cc_reg_operand" "=x,?y,?y,??y,??y,?y") - (compare:CC (and:DI (match_operand:DI 1 "gpc_reg_operand" "%r,r,r,r,r,r") - (match_operand:DI 2 "and64_2_operand" "t,r,S,K,J,t")) - (const_int 0))) - (set (match_operand:DI 0 "gpc_reg_operand" "=r,r,r,r,r,r") - (and:DI (match_dup 1) (match_dup 2))) - (clobber (match_scratch:CC 4 "=X,X,X,x,x,X"))] - "TARGET_64BIT && !rs6000_gen_cell_microcode" - "#" - [(set_attr "type" "delayed_compare,compare,compare,compare,compare,compare") - (set_attr "length" "8,8,8,8,8,12")]) (define_split [(set (match_operand:CC 3 "cc_reg_not_micro_cr0_operand" "") (compare:CC (and:DI (match_operand:DI 1 "gpc_reg_operand" "") @@ -7898,7 +8406,7 @@ "@ %q4. %3,%1,%2 #" - [(set_attr "type" "compare") + [(set_attr "type" "fast_compare,compare") (set_attr "length" "4,8")]) (define_split @@ -7917,7 +8425,7 @@ (define_insn "*booldi3_internal3" [(set (match_operand:CC 3 "cc_reg_operand" "=x,?y") - (compare:CC (match_operator:DI 4 "boolean_operator" + (compare:CC (match_operator:DI 4 "boolean_or_operator" [(match_operand:DI 1 "gpc_reg_operand" "%r,r") (match_operand:DI 2 "gpc_reg_operand" "r,r")]) (const_int 0))) @@ -7927,7 +8435,7 @@ "@ %q4. %0,%1,%2 #" - [(set_attr "type" "compare") + [(set_attr "type" "fast_compare,compare") (set_attr "length" "4,8")]) (define_split @@ -7998,7 +8506,7 @@ "@ %q4. %3,%2,%1 #" - [(set_attr "type" "compare") + [(set_attr "type" "fast_compare,compare") (set_attr "length" "4,8")]) (define_split @@ -8027,7 +8535,7 @@ "@ %q4. 
%0,%2,%1 #" - [(set_attr "type" "compare") + [(set_attr "type" "fast_compare,compare") (set_attr "length" "4,8")]) (define_split @@ -8064,7 +8572,7 @@ "@ %q4. %3,%1,%2 #" - [(set_attr "type" "compare") + [(set_attr "type" "fast_compare,compare") (set_attr "length" "4,8")]) (define_split @@ -8093,7 +8601,7 @@ "@ %q4. %0,%1,%2 #" - [(set_attr "type" "compare") + [(set_attr "type" "fast_compare,compare") (set_attr "length" "4,8")]) (define_split @@ -8110,6 +8618,51 @@ (compare:CC (match_dup 0) (const_int 0)))] "") + +(define_expand "smindi3" + [(match_operand:DI 0 "gpc_reg_operand" "") + (match_operand:DI 1 "gpc_reg_operand" "") + (match_operand:DI 2 "gpc_reg_operand" "")] + "TARGET_ISEL64" + " +{ + rs6000_emit_minmax (operands[0], SMIN, operands[1], operands[2]); + DONE; +}") + +(define_expand "smaxdi3" + [(match_operand:DI 0 "gpc_reg_operand" "") + (match_operand:DI 1 "gpc_reg_operand" "") + (match_operand:DI 2 "gpc_reg_operand" "")] + "TARGET_ISEL64" + " +{ + rs6000_emit_minmax (operands[0], SMAX, operands[1], operands[2]); + DONE; +}") + +(define_expand "umindi3" + [(match_operand:DI 0 "gpc_reg_operand" "") + (match_operand:DI 1 "gpc_reg_operand" "") + (match_operand:DI 2 "gpc_reg_operand" "")] + "TARGET_ISEL64" + " +{ + rs6000_emit_minmax (operands[0], UMIN, operands[1], operands[2]); + DONE; +}") + +(define_expand "umaxdi3" + [(match_operand:DI 0 "gpc_reg_operand" "") + (match_operand:DI 1 "gpc_reg_operand" "") + (match_operand:DI 2 "gpc_reg_operand" "")] + "TARGET_ISEL64" + " +{ + rs6000_emit_minmax (operands[0], UMAX, operands[1], operands[2]); + DONE; +}") + ;; Now define ways of moving data around. 
@@ -8183,8 +8736,8 @@ (define_insn "*movsi_internal1" [(set (match_operand:SI 0 "rs6000_nonimmediate_operand" "=r,r,r,m,r,r,r,r,r,*q,*c*l,*h,*h") (match_operand:SI 1 "input_operand" "r,U,m,r,I,L,n,R,*h,r,r,r,0"))] - "gpc_reg_operand (operands[0], SImode) - || gpc_reg_operand (operands[1], SImode)" + "!TARGET_SINGLE_FPU && + (gpc_reg_operand (operands[0], SImode) || gpc_reg_operand (operands[1], SImode))" "@ mr %0,%1 {cal|la} %0,%a1 @@ -8202,6 +8755,30 @@ [(set_attr "type" "*,*,load,store,*,*,*,*,mfjmpr,*,mtjmpr,*,*") (set_attr "length" "4,4,4,4,4,4,8,4,4,4,4,4,4")]) +(define_insn "*movsi_internal1_single" + [(set (match_operand:SI 0 "rs6000_nonimmediate_operand" "=r,r,r,m,r,r,r,r,r,*q,*c*l,*h,*h,m,*f") + (match_operand:SI 1 "input_operand" "r,U,m,r,I,L,n,R,*h,r,r,r,0,f,m"))] + "TARGET_SINGLE_FPU && + (gpc_reg_operand (operands[0], SImode) || gpc_reg_operand (operands[1], SImode))" + "@ + mr %0,%1 + {cal|la} %0,%a1 + {l%U1%X1|lwz%U1%X1} %0,%1 + {st%U0%X0|stw%U0%X0} %1,%0 + {lil|li} %0,%1 + {liu|lis} %0,%v1 + # + {cal|la} %0,%a1 + mf%1 %0 + mt%0 %1 + mt%0 %1 + mt%0 %1 + {cror 0,0,0|nop} + stfs%U0%X0 %1, %0 + lfs%U1%X1 %0, %1" + [(set_attr "type" "*,*,load,store,*,*,*,*,mfjmpr,*,mtjmpr,*,*,*,*") + (set_attr "length" "4,4,4,4,4,4,8,4,4,4,4,4,4,4,4")]) + ;; Split a load of a large constant into the appropriate two-insn ;; sequence. @@ -8377,7 +8954,7 @@ (match_operand:SF 1 "input_operand" "r,m,r,f,m,f,r,r,h,0,G,Fn"))] "(gpc_reg_operand (operands[0], SFmode) || gpc_reg_operand (operands[1], SFmode)) - && (TARGET_HARD_FLOAT && TARGET_FPRS)" + && (TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT)" "@ mr %0,%1 {l%U1%X1|lwz%U1%X1} %0,%1 @@ -8513,9 +9090,9 @@ ;; The "??" is a kludge until we can figure out a more reasonable way ;; of handling these non-offsettable values. (define_insn "*movdf_hardfloat32" - [(set (match_operand:DF 0 "nonimmediate_operand" "=!r,??r,m,f,f,m,!r,!r,!r") - (match_operand:DF 1 "input_operand" "r,m,r,f,m,f,G,H,F"))] - "! 
TARGET_POWERPC64 && TARGET_HARD_FLOAT && TARGET_FPRS + [(set (match_operand:DF 0 "nonimmediate_operand" "=!r,??r,m,ws,?wa,ws,?wa,Z,?Z,d,d,m,wa,!r,!r,!r") + (match_operand:DF 1 "input_operand" "r,m,r,ws,wa,Z,Z,ws,wa,d,m,d,j,G,H,F"))] + "! TARGET_POWERPC64 && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && (gpc_reg_operand (operands[0], DFmode) || gpc_reg_operand (operands[1], DFmode))" "* @@ -8593,24 +9170,37 @@ return \"\"; } case 3: - return \"fmr %0,%1\"; case 4: - return \"lfd%U1%X1 %0,%1\"; + return \"xxlor %x0,%x1,%x1\"; case 5: - return \"stfd%U0%X0 %1,%0\"; case 6: + return \"lxsd%U1x %x0,%y1\"; case 7: case 8: + return \"stxsd%U0x %x1,%y0\"; + case 9: + return \"fmr %0,%1\"; + case 10: + return \"lfd%U1%X1 %0,%1\"; + case 11: + return \"stfd%U0%X0 %1,%0\"; + case 12: + return \"xxlxor %x0,%x0,%x0\"; + case 13: + case 14: + case 15: return \"#\"; } }" - [(set_attr "type" "two,load,store,fp,fpload,fpstore,*,*,*") - (set_attr "length" "8,16,16,4,4,4,8,12,16")]) + [(set_attr "type" "two,load,store,fp,fp,fpload,fpload,fpstore,fpstore,fp,fpload,fpstore,vecsimple,*,*,*") + (set_attr "length" "8,16,16,4,4,4,4,4,4,4,4,4,4,8,12,16")]) (define_insn "*movdf_softfloat32" [(set (match_operand:DF 0 "nonimmediate_operand" "=r,r,m,r,r,r") (match_operand:DF 1 "input_operand" "r,m,r,G,H,F"))] - "! TARGET_POWERPC64 && (TARGET_SOFT_FLOAT || TARGET_E500_SINGLE) + "! TARGET_POWERPC64 + && ((TARGET_FPRS && TARGET_SINGLE_FLOAT) + || TARGET_SOFT_FLOAT || TARGET_E500_SINGLE) && (gpc_reg_operand (operands[0], DFmode) || gpc_reg_operand (operands[1], DFmode))" "* @@ -8651,18 +9241,26 @@ ; ld/std require word-aligned displacements -> 'Y' constraint. ; List Y->r and r->Y before r->r for reload. 
(define_insn "*movdf_hardfloat64_mfpgpr" - [(set (match_operand:DF 0 "nonimmediate_operand" "=Y,r,!r,f,f,m,*c*l,!r,*h,!r,!r,!r,r,f") - (match_operand:DF 1 "input_operand" "r,Y,r,f,m,f,r,h,0,G,H,F,f,r"))] - "TARGET_POWERPC64 && TARGET_MFPGPR && TARGET_HARD_FLOAT && TARGET_FPRS + [(set (match_operand:DF 0 "nonimmediate_operand" "=Y,r,!r,ws,?wa,ws,?wa,Z,?Z,d,d,m,wa,*c*l,!r,*h,!r,!r,!r,r,d") + (match_operand:DF 1 "input_operand" "r,Y,r,ws,?wa,Z,Z,ws,wa,d,m,d,j,r,h,0,G,H,F,d,r"))] + "TARGET_POWERPC64 && TARGET_MFPGPR && TARGET_HARD_FLOAT && TARGET_FPRS + && TARGET_DOUBLE_FLOAT && (gpc_reg_operand (operands[0], DFmode) || gpc_reg_operand (operands[1], DFmode))" "@ std%U0%X0 %1,%0 ld%U1%X1 %0,%1 mr %0,%1 + xxlor %x0,%x1,%x1 + xxlor %x0,%x1,%x1 + lxsd%U1x %x0,%y1 + lxsd%U1x %x0,%y1 + stxsd%U0x %x1,%y0 + stxsd%U0x %x1,%y0 fmr %0,%1 lfd%U1%X1 %0,%1 stfd%U0%X0 %1,%0 + xxlxor %x0,%x0,%x0 mt%0 %1 mf%1 %0 {cror 0,0,0|nop} @@ -8671,32 +9269,40 @@ # mftgpr %0,%1 mffgpr %0,%1" - [(set_attr "type" "store,load,*,fp,fpload,fpstore,mtjmpr,mfjmpr,*,*,*,*,mftgpr,mffgpr") - (set_attr "length" "4,4,4,4,4,4,4,4,4,8,12,16,4,4")]) + [(set_attr "type" "store,load,*,fp,fp,fpload,fpload,fpstore,fpstore,fp,fpload,fpstore,vecsimple,mtjmpr,mfjmpr,*,*,*,*,mftgpr,mffgpr") + (set_attr "length" "4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,8,12,16,4,4")]) ; ld/std require word-aligned displacements -> 'Y' constraint. ; List Y->r and r->Y before r->r for reload. 
(define_insn "*movdf_hardfloat64" - [(set (match_operand:DF 0 "nonimmediate_operand" "=Y,r,!r,f,f,m,*c*l,!r,*h,!r,!r,!r") - (match_operand:DF 1 "input_operand" "r,Y,r,f,m,f,r,h,0,G,H,F"))] - "TARGET_POWERPC64 && !TARGET_MFPGPR && TARGET_HARD_FLOAT && TARGET_FPRS + [(set (match_operand:DF 0 "nonimmediate_operand" "=Y,r,!r,ws,?wa,ws,?wa,Z,?Z,d,d,m,wa,*c*l,!r,*h,!r,!r,!r") + (match_operand:DF 1 "input_operand" "r,Y,r,ws,wa,Z,Z,ws,wa,d,m,d,j,r,h,0,G,H,F"))] + "TARGET_POWERPC64 && !TARGET_MFPGPR && TARGET_HARD_FLOAT && TARGET_FPRS + && TARGET_DOUBLE_FLOAT && (gpc_reg_operand (operands[0], DFmode) || gpc_reg_operand (operands[1], DFmode))" "@ std%U0%X0 %1,%0 ld%U1%X1 %0,%1 mr %0,%1 + xxlor %x0,%x1,%x1 + xxlor %x0,%x1,%x1 + lxsd%U1x %x0,%y1 + lxsd%U1x %x0,%y1 + stxsd%U0x %x1,%y0 + stxsd%U0x %x1,%y0 fmr %0,%1 lfd%U1%X1 %0,%1 stfd%U0%X0 %1,%0 + xxlxor %x0,%x0,%x0 mt%0 %1 mf%1 %0 {cror 0,0,0|nop} # # #" - [(set_attr "type" "store,load,*,fp,fpload,fpstore,mtjmpr,mfjmpr,*,*,*,*") - (set_attr "length" "4,4,4,4,4,4,4,4,4,8,12,16")]) + [(set_attr "type" "store,load,*,fp,fp,fpload,fpload,fpstore,fpstore,fp,fpload,fpstore,vecsimple,mtjmpr,mfjmpr,*,*,*,*") + (set_attr "length" "4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,8,12,16")]) (define_insn "*movdf_softfloat64" [(set (match_operand:DF 0 "nonimmediate_operand" "=r,Y,r,cl,r,r,r,r,*h") @@ -8727,8 +9333,8 @@ ; otherwise reload, given m->f, will try to pick f->f and reload it, ; which doesn't make progress. Likewise r->Y must be before r->r. 
(define_insn_and_split "*movtf_internal" - [(set (match_operand:TF 0 "nonimmediate_operand" "=o,f,f,r,Y,r") - (match_operand:TF 1 "input_operand" "f,o,f,YGHF,r,r"))] + [(set (match_operand:TF 0 "nonimmediate_operand" "=o,d,d,r,Y,r") + (match_operand:TF 1 "input_operand" "d,o,d,YGHF,r,r"))] "!TARGET_IEEEQUAD && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_LONG_DOUBLE_128 && (gpc_reg_operand (operands[0], TFmode) @@ -8772,7 +9378,8 @@ (float_extend:TF (match_operand:DF 1 "input_operand" ""))) (use (match_dup 2))])] "!TARGET_IEEEQUAD - && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_LONG_DOUBLE_128" + && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT + && TARGET_LONG_DOUBLE_128" { operands[2] = CONST0_RTX (DFmode); /* Generate GOT reference early for SVR4 PIC. */ @@ -8781,11 +9388,12 @@ }) (define_insn_and_split "*extenddftf2_internal" - [(set (match_operand:TF 0 "nonimmediate_operand" "=o,f,&f,r") - (float_extend:TF (match_operand:DF 1 "input_operand" "fr,mf,mf,rmGHF"))) - (use (match_operand:DF 2 "zero_reg_mem_operand" "rf,m,f,n"))] + [(set (match_operand:TF 0 "nonimmediate_operand" "=o,d,&d,r") + (float_extend:TF (match_operand:DF 1 "input_operand" "dr,md,md,rmGHF"))) + (use (match_operand:DF 2 "zero_reg_mem_operand" "rd,m,d,n"))] "!TARGET_IEEEQUAD - && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_LONG_DOUBLE_128" + && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT + && TARGET_LONG_DOUBLE_128" "#" "&& reload_completed" [(pc)] @@ -8823,8 +9431,8 @@ "") (define_insn_and_split "trunctfdf2_internal1" - [(set (match_operand:DF 0 "gpc_reg_operand" "=f,?f") - (float_truncate:DF (match_operand:TF 1 "gpc_reg_operand" "0,f")))] + [(set (match_operand:DF 0 "gpc_reg_operand" "=d,?d") + (float_truncate:DF (match_operand:TF 1 "gpc_reg_operand" "0,d")))] "!TARGET_IEEEQUAD && !TARGET_XL_COMPAT && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_LONG_DOUBLE_128" "@ @@ -8839,12 +9447,14 @@ [(set_attr "type" "fp")]) (define_insn "trunctfdf2_internal2" - [(set (match_operand:DF 
0 "gpc_reg_operand" "=f") - (float_truncate:DF (match_operand:TF 1 "gpc_reg_operand" "f")))] + [(set (match_operand:DF 0 "gpc_reg_operand" "=d") + (float_truncate:DF (match_operand:TF 1 "gpc_reg_operand" "d")))] "!TARGET_IEEEQUAD && TARGET_XL_COMPAT - && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_LONG_DOUBLE_128" + && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT + && TARGET_LONG_DOUBLE_128" "fadd %0,%1,%L1" - [(set_attr "type" "fp")]) + [(set_attr "type" "fp") + (set_attr "fp_type" "fp_addsub_d")]) (define_expand "trunctfsf2" [(set (match_operand:SF 0 "gpc_reg_operand" "") @@ -8863,10 +9473,11 @@ (define_insn_and_split "trunctfsf2_fprs" [(set (match_operand:SF 0 "gpc_reg_operand" "=f") - (float_truncate:SF (match_operand:TF 1 "gpc_reg_operand" "f"))) - (clobber (match_scratch:DF 2 "=f"))] + (float_truncate:SF (match_operand:TF 1 "gpc_reg_operand" "d"))) + (clobber (match_scratch:DF 2 "=d"))] "!TARGET_IEEEQUAD - && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_LONG_DOUBLE_128" + && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT + && TARGET_LONG_DOUBLE_128" "#" "&& reload_completed" [(set (match_dup 2) @@ -8892,11 +9503,11 @@ ; fadd, but rounding towards zero. ; This is probably not the optimal code sequence. 
(define_insn "fix_trunc_helper" - [(set (match_operand:DF 0 "gpc_reg_operand" "=f") - (unspec:DF [(match_operand:TF 1 "gpc_reg_operand" "f")] + [(set (match_operand:DF 0 "gpc_reg_operand" "=d") + (unspec:DF [(match_operand:TF 1 "gpc_reg_operand" "d")] UNSPEC_FIX_TRUNC_TF)) - (clobber (match_operand:DF 2 "gpc_reg_operand" "=&f"))] - "TARGET_HARD_FLOAT && TARGET_FPRS" + (clobber (match_operand:DF 2 "gpc_reg_operand" "=&d"))] + "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT" "mffs %2\n\tmtfsb1 31\n\tmtfsb0 30\n\tfadd %0,%1,%L1\n\tmtfsf 1,%2" [(set_attr "type" "fp") (set_attr "length" "20")]) @@ -8936,15 +9547,15 @@ (define_insn_and_split "*fix_trunctfsi2_internal" [(set (match_operand:SI 0 "gpc_reg_operand" "=r") - (fix:SI (match_operand:TF 1 "gpc_reg_operand" "f"))) - (clobber (match_operand:DF 2 "gpc_reg_operand" "=f")) - (clobber (match_operand:DF 3 "gpc_reg_operand" "=&f")) - (clobber (match_operand:DI 4 "gpc_reg_operand" "=f")) + (fix:SI (match_operand:TF 1 "gpc_reg_operand" "d"))) + (clobber (match_operand:DF 2 "gpc_reg_operand" "=d")) + (clobber (match_operand:DF 3 "gpc_reg_operand" "=&d")) + (clobber (match_operand:DI 4 "gpc_reg_operand" "=d")) (clobber (match_operand:DI 5 "offsettable_mem_operand" "=o"))] "!TARGET_IEEEQUAD && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_LONG_DOUBLE_128" "#" - "&& (can_create_pseudo_p () || offsettable_nonstrict_memref_p (operands[5]))" + "" [(pc)] { rtx lowword; @@ -8969,8 +9580,8 @@ "") (define_insn "negtf2_internal" - [(set (match_operand:TF 0 "gpc_reg_operand" "=f") - (neg:TF (match_operand:TF 1 "gpc_reg_operand" "f")))] + [(set (match_operand:TF 0 "gpc_reg_operand" "=d") + (neg:TF (match_operand:TF 1 "gpc_reg_operand" "d")))] "!TARGET_IEEEQUAD && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_LONG_DOUBLE_128" "* @@ -8995,7 +9606,7 @@ rtx label = gen_label_rtx (); if (TARGET_E500_DOUBLE) { - if (flag_unsafe_math_optimizations) + if (flag_finite_math_only && !flag_trapping_math) emit_insn (gen_spe_abstf2_tst 
(operands[0], operands[1], label)); else emit_insn (gen_spe_abstf2_cmp (operands[0], operands[1], label)); @@ -9017,7 +9628,8 @@ (pc))) (set (match_dup 6) (neg:DF (match_dup 6)))] "!TARGET_IEEEQUAD - && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_LONG_DOUBLE_128" + && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT + && TARGET_LONG_DOUBLE_128" " { const int hi_word = FLOAT_WORDS_BIG_ENDIAN ? 0 : GET_MODE_SIZE (DFmode); @@ -9034,8 +9646,8 @@ ; List r->r after r->"o<>", otherwise reload will try to reload a ; non-offsettable address by using r->r which won't make progress. (define_insn "*movdi_internal32" - [(set (match_operand:DI 0 "rs6000_nonimmediate_operand" "=o<>,r,r,*f,*f,m,r") - (match_operand:DI 1 "input_operand" "r,r,m,f,m,f,IJKnGHF"))] + [(set (match_operand:DI 0 "rs6000_nonimmediate_operand" "=o<>,r,r,*d,*d,m,r") + (match_operand:DI 1 "input_operand" "r,r,m,d,m,d,IJKnGHF"))] "! TARGET_POWERPC64 && (gpc_reg_operand (operands[0], DImode) || gpc_reg_operand (operands[1], DImode))" @@ -9079,8 +9691,8 @@ { rs6000_split_multireg_move (operands[0], operands[1]); DONE; }) (define_insn "*movdi_mfpgpr" - [(set (match_operand:DI 0 "nonimmediate_operand" "=r,r,m,r,r,r,r,*f,*f,m,r,*h,*h,r,*f") - (match_operand:DI 1 "input_operand" "r,m,r,I,L,nF,R,f,m,f,*h,r,0,*f,r"))] + [(set (match_operand:DI 0 "nonimmediate_operand" "=r,r,m,r,r,r,r,*d,*d,m,r,*h,*h,r,*d") + (match_operand:DI 1 "input_operand" "r,m,r,I,L,nF,R,d,m,d,*h,r,0,*d,r"))] "TARGET_POWERPC64 && TARGET_MFPGPR && TARGET_HARD_FLOAT && TARGET_FPRS && (gpc_reg_operand (operands[0], DImode) || gpc_reg_operand (operands[1], DImode))" @@ -9104,8 +9716,8 @@ (set_attr "length" "4,4,4,4,4,20,4,4,4,4,4,4,4,4,4")]) (define_insn "*movdi_internal64" - [(set (match_operand:DI 0 "nonimmediate_operand" "=r,r,m,r,r,r,r,*f,*f,m,r,*h,*h") - (match_operand:DI 1 "input_operand" "r,m,r,I,L,nF,R,f,m,f,*h,r,0"))] + [(set (match_operand:DI 0 "nonimmediate_operand" "=r,r,m,r,r,r,r,*d,*d,m,r,*h,*h") + (match_operand:DI 1 
"input_operand" "r,m,r,I,L,nF,R,d,m,d,*h,r,0"))] "TARGET_POWERPC64 && (!TARGET_MFPGPR || !TARGET_HARD_FLOAT || !TARGET_FPRS) && (gpc_reg_operand (operands[0], DImode) || gpc_reg_operand (operands[1], DImode))" @@ -9267,15 +9879,16 @@ (define_insn "*movti_ppc64" [(set (match_operand:TI 0 "nonimmediate_operand" "=r,o<>,r") (match_operand:TI 1 "input_operand" "r,r,m"))] - "TARGET_POWERPC64 && (gpc_reg_operand (operands[0], TImode) - || gpc_reg_operand (operands[1], TImode))" + "(TARGET_POWERPC64 && (gpc_reg_operand (operands[0], TImode) + || gpc_reg_operand (operands[1], TImode))) + && VECTOR_MEM_NONE_P (TImode)" "#" - [(set_attr "type" "*,load,store")]) + [(set_attr "type" "*,store,load")]) (define_split [(set (match_operand:TI 0 "gpc_reg_operand" "") (match_operand:TI 1 "const_double_operand" ""))] - "TARGET_POWERPC64" + "TARGET_POWERPC64 && VECTOR_MEM_NONE_P (TImode)" [(set (match_dup 2) (match_dup 4)) (set (match_dup 3) (match_dup 5))] " @@ -9301,7 +9914,7 @@ (define_split [(set (match_operand:TI 0 "nonimmediate_operand" "") (match_operand:TI 1 "input_operand" ""))] - "reload_completed + "reload_completed && VECTOR_MEM_NONE_P (TImode) && gpr_or_gpr_p (operands[0], operands[1])" [(pc)] { rs6000_split_multireg_move (operands[0], operands[1]); DONE; }) @@ -10047,7 +10660,9 @@ (match_operand:DI 2 "reg_or_aligned_short_operand" "r,I")))) (set (match_operand:DI 0 "gpc_reg_operand" "=b,b") (plus:DI (match_dup 1) (match_dup 2)))] - "TARGET_POWERPC64 && TARGET_UPDATE" + "TARGET_POWERPC64 && TARGET_UPDATE + && (!avoiding_indexed_address_p (DImode) + || !gpc_reg_operand (operands[2], DImode))" "@ ldux %3,%0,%2 ldu %3,%2(%0)" @@ -10059,7 +10674,25 @@ (match_operand:DI 3 "gpc_reg_operand" "r,r")) (set (match_operand:P 0 "gpc_reg_operand" "=b,b") (plus:P (match_dup 1) (match_dup 2)))] - "TARGET_POWERPC64 && TARGET_UPDATE" + "TARGET_POWERPC64 && TARGET_UPDATE + && (!avoiding_indexed_address_p (Pmode) + || !gpc_reg_operand (operands[2], Pmode) + || (REG_P (operands[0]) + && REGNO 
(operands[0]) == STACK_POINTER_REGNUM))" + "@ + stdux %3,%0,%2 + stdu %3,%2(%0)" + [(set_attr "type" "store_ux,store_u")]) + +;; This pattern is only conditional on TARGET_POWERPC64, as it is +;; needed for stack allocation, even if the user passes -mno-update. +(define_insn "movdi_<mode>_update_stack" + [(set (mem:DI (plus:P (match_operand:P 1 "gpc_reg_operand" "0,0") + (match_operand:P 2 "reg_or_aligned_short_operand" "r,I"))) + (match_operand:DI 3 "gpc_reg_operand" "r,r")) + (set (match_operand:P 0 "gpc_reg_operand" "=b,b") + (plus:P (match_dup 1) (match_dup 2)))] + "TARGET_POWERPC64" "@ stdux %3,%0,%2 stdu %3,%2(%0)" @@ -10071,7 +10704,9 @@ (match_operand:SI 2 "reg_or_short_operand" "r,I")))) (set (match_operand:SI 0 "gpc_reg_operand" "=b,b") (plus:SI (match_dup 1) (match_dup 2)))] - "TARGET_UPDATE" + "TARGET_UPDATE + && (!avoiding_indexed_address_p (SImode) + || !gpc_reg_operand (operands[2], SImode))" "@ {lux|lwzux} %3,%0,%2 {lu|lwzu} %3,%2(%0)" @@ -10084,7 +10719,8 @@ (match_operand:DI 2 "gpc_reg_operand" "r"))))) (set (match_operand:DI 0 "gpc_reg_operand" "=b") (plus:DI (match_dup 1) (match_dup 2)))] - "TARGET_POWERPC64 && rs6000_gen_cell_microcode" + "TARGET_POWERPC64 && rs6000_gen_cell_microcode + && !avoiding_indexed_address_p (DImode)" "lwaux %3,%0,%2" [(set_attr "type" "load_ext_ux")]) @@ -10094,7 +10730,25 @@ (match_operand:SI 3 "gpc_reg_operand" "r,r")) (set (match_operand:SI 0 "gpc_reg_operand" "=b,b") (plus:SI (match_dup 1) (match_dup 2)))] - "TARGET_UPDATE" + "TARGET_UPDATE + && (!avoiding_indexed_address_p (SImode) + || !gpc_reg_operand (operands[2], SImode) + || (REG_P (operands[0]) + && REGNO (operands[0]) == STACK_POINTER_REGNUM))" + "@ + {stux|stwux} %3,%0,%2 + {stu|stwu} %3,%2(%0)" + [(set_attr "type" "store_ux,store_u")]) + +;; This is an unconditional pattern; needed for stack allocation, even +;; if the user passes -mno-update. 
+(define_insn "movsi_update_stack" + [(set (mem:SI (plus:SI (match_operand:SI 1 "gpc_reg_operand" "0,0") + (match_operand:SI 2 "reg_or_short_operand" "r,I"))) + (match_operand:SI 3 "gpc_reg_operand" "r,r")) + (set (match_operand:SI 0 "gpc_reg_operand" "=b,b") + (plus:SI (match_dup 1) (match_dup 2)))] + "" "@ {stux|stwux} %3,%0,%2 {stu|stwu} %3,%2(%0)" @@ -10106,7 +10760,9 @@ (match_operand:SI 2 "reg_or_short_operand" "r,I")))) (set (match_operand:SI 0 "gpc_reg_operand" "=b,b") (plus:SI (match_dup 1) (match_dup 2)))] - "TARGET_UPDATE" + "TARGET_UPDATE + && (!avoiding_indexed_address_p (SImode) + || !gpc_reg_operand (operands[2], SImode))" "@ lhzux %3,%0,%2 lhzu %3,%2(%0)" @@ -10119,7 +10775,9 @@ (match_operand:SI 2 "reg_or_short_operand" "r,I"))))) (set (match_operand:SI 0 "gpc_reg_operand" "=b,b") (plus:SI (match_dup 1) (match_dup 2)))] - "TARGET_UPDATE" + "TARGET_UPDATE + && (!avoiding_indexed_address_p (SImode) + || !gpc_reg_operand (operands[2], SImode))" "@ lhzux %3,%0,%2 lhzu %3,%2(%0)" @@ -10132,7 +10790,9 @@ (match_operand:SI 2 "reg_or_short_operand" "r,I"))))) (set (match_operand:SI 0 "gpc_reg_operand" "=b,b") (plus:SI (match_dup 1) (match_dup 2)))] - "TARGET_UPDATE && rs6000_gen_cell_microcode" + "TARGET_UPDATE && rs6000_gen_cell_microcode + && (!avoiding_indexed_address_p (SImode) + || !gpc_reg_operand (operands[2], SImode))" "@ lhaux %3,%0,%2 lhau %3,%2(%0)" @@ -10144,7 +10804,9 @@ (match_operand:HI 3 "gpc_reg_operand" "r,r")) (set (match_operand:SI 0 "gpc_reg_operand" "=b,b") (plus:SI (match_dup 1) (match_dup 2)))] - "TARGET_UPDATE" + "TARGET_UPDATE + && (!avoiding_indexed_address_p (SImode) + || !gpc_reg_operand (operands[2], SImode))" "@ sthux %3,%0,%2 sthu %3,%2(%0)" @@ -10156,7 +10818,9 @@ (match_operand:SI 2 "reg_or_short_operand" "r,I")))) (set (match_operand:SI 0 "gpc_reg_operand" "=b,b") (plus:SI (match_dup 1) (match_dup 2)))] - "TARGET_UPDATE" + "TARGET_UPDATE + && (!avoiding_indexed_address_p (SImode) + || !gpc_reg_operand (operands[2], 
SImode))" "@ lbzux %3,%0,%2 lbzu %3,%2(%0)" @@ -10169,7 +10833,9 @@ (match_operand:SI 2 "reg_or_short_operand" "r,I"))))) (set (match_operand:SI 0 "gpc_reg_operand" "=b,b") (plus:SI (match_dup 1) (match_dup 2)))] - "TARGET_UPDATE" + "TARGET_UPDATE + && (!avoiding_indexed_address_p (SImode) + || !gpc_reg_operand (operands[2], SImode))" "@ lbzux %3,%0,%2 lbzu %3,%2(%0)" @@ -10181,7 +10847,9 @@ (match_operand:QI 3 "gpc_reg_operand" "r,r")) (set (match_operand:SI 0 "gpc_reg_operand" "=b,b") (plus:SI (match_dup 1) (match_dup 2)))] - "TARGET_UPDATE" + "TARGET_UPDATE + && (!avoiding_indexed_address_p (SImode) + || !gpc_reg_operand (operands[2], SImode))" "@ stbux %3,%0,%2 stbu %3,%2(%0)" @@ -10193,7 +10861,9 @@ (match_operand:SI 2 "reg_or_short_operand" "r,I")))) (set (match_operand:SI 0 "gpc_reg_operand" "=b,b") (plus:SI (match_dup 1) (match_dup 2)))] - "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_UPDATE" + "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT && TARGET_UPDATE + && (!avoiding_indexed_address_p (SImode) + || !gpc_reg_operand (operands[2], SImode))" "@ lfsux %3,%0,%2 lfsu %3,%2(%0)" @@ -10205,7 +10875,9 @@ (match_operand:SF 3 "gpc_reg_operand" "f,f")) (set (match_operand:SI 0 "gpc_reg_operand" "=b,b") (plus:SI (match_dup 1) (match_dup 2)))] - "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_UPDATE" + "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT && TARGET_UPDATE + && (!avoiding_indexed_address_p (SImode) + || !gpc_reg_operand (operands[2], SImode))" "@ stfsux %3,%0,%2 stfsu %3,%2(%0)" @@ -10217,7 +10889,9 @@ (match_operand:SI 2 "reg_or_short_operand" "r,I")))) (set (match_operand:SI 0 "gpc_reg_operand" "=b,b") (plus:SI (match_dup 1) (match_dup 2)))] - "(TARGET_SOFT_FLOAT || !TARGET_FPRS) && TARGET_UPDATE" + "(TARGET_SOFT_FLOAT || !TARGET_FPRS) && TARGET_UPDATE + && (!avoiding_indexed_address_p (SImode) + || !gpc_reg_operand (operands[2], SImode))" "@ {lux|lwzux} %3,%0,%2 {lu|lwzu} %3,%2(%0)" @@ -10229,19 +10903,23 @@ (match_operand:SF 3 
"gpc_reg_operand" "r,r")) (set (match_operand:SI 0 "gpc_reg_operand" "=b,b") (plus:SI (match_dup 1) (match_dup 2)))] - "(TARGET_SOFT_FLOAT || !TARGET_FPRS) && TARGET_UPDATE" + "(TARGET_SOFT_FLOAT || !TARGET_FPRS) && TARGET_UPDATE + && (!avoiding_indexed_address_p (SImode) + || !gpc_reg_operand (operands[2], SImode))" "@ {stux|stwux} %3,%0,%2 {stu|stwu} %3,%2(%0)" [(set_attr "type" "store_ux,store_u")]) (define_insn "*movdf_update1" - [(set (match_operand:DF 3 "gpc_reg_operand" "=f,f") + [(set (match_operand:DF 3 "gpc_reg_operand" "=d,d") (mem:DF (plus:SI (match_operand:SI 1 "gpc_reg_operand" "0,0") (match_operand:SI 2 "reg_or_short_operand" "r,I")))) (set (match_operand:SI 0 "gpc_reg_operand" "=b,b") (plus:SI (match_dup 1) (match_dup 2)))] - "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_UPDATE" + "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && TARGET_UPDATE + && (!avoiding_indexed_address_p (SImode) + || !gpc_reg_operand (operands[2], SImode))" "@ lfdux %3,%0,%2 lfdu %3,%2(%0)" @@ -10250,10 +10928,12 @@ (define_insn "*movdf_update2" [(set (mem:DF (plus:SI (match_operand:SI 1 "gpc_reg_operand" "0,0") (match_operand:SI 2 "reg_or_short_operand" "r,I"))) - (match_operand:DF 3 "gpc_reg_operand" "f,f")) + (match_operand:DF 3 "gpc_reg_operand" "d,d")) (set (match_operand:SI 0 "gpc_reg_operand" "=b,b") (plus:SI (match_dup 1) (match_dup 2)))] - "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_UPDATE" + "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && TARGET_UPDATE + && (!avoiding_indexed_address_p (SImode) + || !gpc_reg_operand (operands[2], SImode))" "@ stfdux %3,%0,%2 stfdu %3,%2(%0)" @@ -10274,7 +10954,7 @@ (set (match_operand:DF 2 "gpc_reg_operand" "") (match_operand:DF 3 "memory_operand" ""))] "TARGET_POWER2 - && TARGET_HARD_FLOAT && TARGET_FPRS + && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && registers_ok_for_quad_peep (operands[0], operands[2]) && mems_ok_for_quad_peep (operands[1], operands[3])" [(set (match_dup 0) @@ -10296,7 +10976,7 
@@ (set (match_operand:DF 2 "memory_operand" "") (match_operand:DF 3 "gpc_reg_operand" ""))] "TARGET_POWER2 - && TARGET_HARD_FLOAT && TARGET_FPRS + && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && registers_ok_for_quad_peep (operands[1], operands[3]) && mems_ok_for_quad_peep (operands[0], operands[2])" [(set (match_dup 0) @@ -10331,183 +11011,276 @@ ;; TLS support. -;; "b" output constraint here and on tls_ld to support tls linker optimization. -(define_insn "tls_gd_32" - [(set (match_operand:SI 0 "gpc_reg_operand" "=b") - (unspec:SI [(match_operand:SI 1 "gpc_reg_operand" "b") - (match_operand:SI 2 "rs6000_tls_symbol_ref" "")] - UNSPEC_TLSGD))] - "HAVE_AS_TLS && !TARGET_64BIT" - "addi %0,%1,%2@got@tlsgd") - -(define_insn "tls_gd_64" - [(set (match_operand:DI 0 "gpc_reg_operand" "=b") - (unspec:DI [(match_operand:DI 1 "gpc_reg_operand" "b") - (match_operand:DI 2 "rs6000_tls_symbol_ref" "")] - UNSPEC_TLSGD))] - "HAVE_AS_TLS && TARGET_64BIT" - "addi %0,%1,%2@got@tlsgd") +;; Mode attributes for different ABIs. +(define_mode_iterator TLSmode [(SI "! TARGET_64BIT") (DI "TARGET_64BIT")]) +(define_mode_attr tls_abi_suffix [(SI "32") (DI "64")]) +(define_mode_attr tls_sysv_suffix [(SI "si") (DI "di")]) +(define_mode_attr tls_insn_suffix [(SI "wz") (DI "d")]) + +(define_insn_and_split "tls_gd_aix<TLSmode:tls_abi_suffix>" + [(set (match_operand:TLSmode 0 "gpc_reg_operand" "=b") + (call (mem:TLSmode (match_operand:TLSmode 3 "symbol_ref_operand" "s")) + (match_operand 4 "" "g"))) + (unspec:TLSmode [(match_operand:TLSmode 1 "gpc_reg_operand" "b") + (match_operand:TLSmode 2 "rs6000_tls_symbol_ref" "")] + UNSPEC_TLSGD) + (clobber (reg:SI LR_REGNO))] + "HAVE_AS_TLS && DEFAULT_ABI == ABI_AIX" + "addi %0,%1,%2@got@tlsgd\;bl %z3\;%." 
+ "&& TARGET_TLS_MARKERS" + [(set (match_dup 0) + (unspec:TLSmode [(match_dup 1) + (match_dup 2)] + UNSPEC_TLSGD)) + (parallel [(set (match_dup 0) + (call (mem:TLSmode (match_dup 3)) + (match_dup 4))) + (unspec:TLSmode [(match_dup 2)] UNSPEC_TLSGD) + (clobber (reg:SI LR_REGNO))])] + "" + [(set_attr "type" "two") + (set_attr "length" "12")]) -(define_insn "tls_ld_32" - [(set (match_operand:SI 0 "gpc_reg_operand" "=b") - (unspec:SI [(match_operand:SI 1 "gpc_reg_operand" "b")] - UNSPEC_TLSLD))] - "HAVE_AS_TLS && !TARGET_64BIT" - "addi %0,%1,%&@got@tlsld") - -(define_insn "tls_ld_64" - [(set (match_operand:DI 0 "gpc_reg_operand" "=b") - (unspec:DI [(match_operand:DI 1 "gpc_reg_operand" "b")] - UNSPEC_TLSLD))] - "HAVE_AS_TLS && TARGET_64BIT" - "addi %0,%1,%&@got@tlsld") - -(define_insn "tls_dtprel_32" - [(set (match_operand:SI 0 "gpc_reg_operand" "=r") - (unspec:SI [(match_operand:SI 1 "gpc_reg_operand" "b") - (match_operand:SI 2 "rs6000_tls_symbol_ref" "")] - UNSPEC_TLSDTPREL))] - "HAVE_AS_TLS && !TARGET_64BIT" - "addi %0,%1,%2@dtprel") +(define_insn_and_split "tls_gd_sysv<TLSmode:tls_sysv_suffix>" + [(set (match_operand:TLSmode 0 "gpc_reg_operand" "=b") + (call (mem:TLSmode (match_operand:TLSmode 3 "symbol_ref_operand" "s")) + (match_operand 4 "" "g"))) + (unspec:TLSmode [(match_operand:TLSmode 1 "gpc_reg_operand" "b") + (match_operand:TLSmode 2 "rs6000_tls_symbol_ref" "")] + UNSPEC_TLSGD) + (clobber (reg:SI LR_REGNO))] + "HAVE_AS_TLS && DEFAULT_ABI == ABI_V4" +{ + if (flag_pic) + { + if (TARGET_SECURE_PLT && flag_pic == 2) + return "addi %0,%1,%2@got@tlsgd\;bl %z3+32768@plt"; + else + return "addi %0,%1,%2@got@tlsgd\;bl %z3@plt"; + } + else + return "addi %0,%1,%2@got@tlsgd\;bl %z3"; +} + "&& TARGET_TLS_MARKERS" + [(set (match_dup 0) + (unspec:TLSmode [(match_dup 1) + (match_dup 2)] + UNSPEC_TLSGD)) + (parallel [(set (match_dup 0) + (call (mem:TLSmode (match_dup 3)) + (match_dup 4))) + (unspec:TLSmode [(match_dup 2)] UNSPEC_TLSGD) + (clobber (reg:SI LR_REGNO))])] + 
"" + [(set_attr "type" "two") + (set_attr "length" "8")]) -(define_insn "tls_dtprel_64" - [(set (match_operand:DI 0 "gpc_reg_operand" "=r") - (unspec:DI [(match_operand:DI 1 "gpc_reg_operand" "b") - (match_operand:DI 2 "rs6000_tls_symbol_ref" "")] - UNSPEC_TLSDTPREL))] - "HAVE_AS_TLS && TARGET_64BIT" - "addi %0,%1,%2@dtprel") +(define_insn "*tls_gd<TLSmode:tls_abi_suffix>" + [(set (match_operand:TLSmode 0 "gpc_reg_operand" "=b") + (unspec:TLSmode [(match_operand:TLSmode 1 "gpc_reg_operand" "b") + (match_operand:TLSmode 2 "rs6000_tls_symbol_ref" "")] + UNSPEC_TLSGD))] + "HAVE_AS_TLS && TARGET_TLS_MARKERS" + "addi %0,%1,%2@got@tlsgd" + [(set_attr "length" "4")]) -(define_insn "tls_dtprel_ha_32" - [(set (match_operand:SI 0 "gpc_reg_operand" "=r") - (unspec:SI [(match_operand:SI 1 "gpc_reg_operand" "b") - (match_operand:SI 2 "rs6000_tls_symbol_ref" "")] - UNSPEC_TLSDTPRELHA))] - "HAVE_AS_TLS && !TARGET_64BIT" - "addis %0,%1,%2@dtprel@ha") +(define_insn "*tls_gd_call_aix<TLSmode:tls_abi_suffix>" + [(set (match_operand:TLSmode 0 "gpc_reg_operand" "=b") + (call (mem:TLSmode (match_operand:TLSmode 1 "symbol_ref_operand" "s")) + (match_operand 2 "" "g"))) + (unspec:TLSmode [(match_operand:TLSmode 3 "rs6000_tls_symbol_ref" "")] + UNSPEC_TLSGD) + (clobber (reg:SI LR_REGNO))] + "HAVE_AS_TLS && DEFAULT_ABI == ABI_AIX && TARGET_TLS_MARKERS" + "bl %z1(%3@tlsgd)\;%." 
+ [(set_attr "type" "branch") + (set_attr "length" "8")]) -(define_insn "tls_dtprel_ha_64" - [(set (match_operand:DI 0 "gpc_reg_operand" "=r") - (unspec:DI [(match_operand:DI 1 "gpc_reg_operand" "b") - (match_operand:DI 2 "rs6000_tls_symbol_ref" "")] - UNSPEC_TLSDTPRELHA))] - "HAVE_AS_TLS && TARGET_64BIT" - "addis %0,%1,%2@dtprel@ha") +(define_insn "*tls_gd_call_sysv<TLSmode:tls_abi_suffix>" + [(set (match_operand:TLSmode 0 "gpc_reg_operand" "=b") + (call (mem:TLSmode (match_operand:TLSmode 1 "symbol_ref_operand" "s")) + (match_operand 2 "" "g"))) + (unspec:TLSmode [(match_operand:TLSmode 3 "rs6000_tls_symbol_ref" "")] + UNSPEC_TLSGD) + (clobber (reg:SI LR_REGNO))] + "HAVE_AS_TLS && DEFAULT_ABI == ABI_V4 && TARGET_TLS_MARKERS" +{ + if (flag_pic) + { + if (TARGET_SECURE_PLT && flag_pic == 2) + return "bl %z1+32768(%3@tlsgd)@plt"; + return "bl %z1(%3@tlsgd)@plt"; + } + return "bl %z1(%3@tlsgd)"; +} + [(set_attr "type" "branch") + (set_attr "length" "4")]) -(define_insn "tls_dtprel_lo_32" - [(set (match_operand:SI 0 "gpc_reg_operand" "=r") - (unspec:SI [(match_operand:SI 1 "gpc_reg_operand" "b") - (match_operand:SI 2 "rs6000_tls_symbol_ref" "")] - UNSPEC_TLSDTPRELLO))] - "HAVE_AS_TLS && !TARGET_64BIT" - "addi %0,%1,%2@dtprel@l") +(define_insn_and_split "tls_ld_aix<TLSmode:tls_abi_suffix>" + [(set (match_operand:TLSmode 0 "gpc_reg_operand" "=b") + (call (mem:TLSmode (match_operand:TLSmode 2 "symbol_ref_operand" "s")) + (match_operand 3 "" "g"))) + (unspec:TLSmode [(match_operand:TLSmode 1 "gpc_reg_operand" "b")] + UNSPEC_TLSLD) + (clobber (reg:SI LR_REGNO))] + "HAVE_AS_TLS && DEFAULT_ABI == ABI_AIX" + "addi %0,%1,%&@got@tlsld\;bl %z2\;%." 
+ "&& TARGET_TLS_MARKERS" + [(set (match_dup 0) + (unspec:TLSmode [(match_dup 1)] + UNSPEC_TLSLD)) + (parallel [(set (match_dup 0) + (call (mem:TLSmode (match_dup 2)) + (match_dup 3))) + (unspec:TLSmode [(const_int 0)] UNSPEC_TLSLD) + (clobber (reg:SI LR_REGNO))])] + "" + [(set_attr "length" "12")]) -(define_insn "tls_dtprel_lo_64" - [(set (match_operand:DI 0 "gpc_reg_operand" "=r") - (unspec:DI [(match_operand:DI 1 "gpc_reg_operand" "b") - (match_operand:DI 2 "rs6000_tls_symbol_ref" "")] - UNSPEC_TLSDTPRELLO))] - "HAVE_AS_TLS && TARGET_64BIT" - "addi %0,%1,%2@dtprel@l") +(define_insn_and_split "tls_ld_sysv<TLSmode:tls_sysv_suffix>" + [(set (match_operand:TLSmode 0 "gpc_reg_operand" "=b") + (call (mem:TLSmode (match_operand:TLSmode 2 "symbol_ref_operand" "s")) + (match_operand 3 "" "g"))) + (unspec:TLSmode [(match_operand:TLSmode 1 "gpc_reg_operand" "b")] + UNSPEC_TLSLD) + (clobber (reg:SI LR_REGNO))] + "HAVE_AS_TLS && DEFAULT_ABI == ABI_V4" +{ + if (flag_pic) + { + if (TARGET_SECURE_PLT && flag_pic == 2) + return "addi %0,%1,%&@got@tlsld\;bl %z2+32768@plt"; + else + return "addi %0,%1,%&@got@tlsld\;bl %z2@plt"; + } + else + return "addi %0,%1,%&@got@tlsld\;bl %z2"; +} + "&& TARGET_TLS_MARKERS" + [(set (match_dup 0) + (unspec:TLSmode [(match_dup 1)] + UNSPEC_TLSLD)) + (parallel [(set (match_dup 0) + (call (mem:TLSmode (match_dup 2)) + (match_dup 3))) + (unspec:TLSmode [(const_int 0)] UNSPEC_TLSLD) + (clobber (reg:SI LR_REGNO))])] + "" + [(set_attr "length" "8")]) -(define_insn "tls_got_dtprel_32" - [(set (match_operand:SI 0 "gpc_reg_operand" "=r") - (unspec:SI [(match_operand:SI 1 "gpc_reg_operand" "b") - (match_operand:SI 2 "rs6000_tls_symbol_ref" "")] - UNSPEC_TLSGOTDTPREL))] - "HAVE_AS_TLS && !TARGET_64BIT" - "lwz %0,%2@got@dtprel(%1)") +(define_insn "*tls_ld<TLSmode:tls_abi_suffix>" + [(set (match_operand:TLSmode 0 "gpc_reg_operand" "=b") + (unspec:TLSmode [(match_operand:TLSmode 1 "gpc_reg_operand" "b")] + UNSPEC_TLSLD))] + "HAVE_AS_TLS && TARGET_TLS_MARKERS" 
+ "addi %0,%1,%&@got@tlsld" + [(set_attr "length" "4")]) -(define_insn "tls_got_dtprel_64" - [(set (match_operand:DI 0 "gpc_reg_operand" "=r") - (unspec:DI [(match_operand:DI 1 "gpc_reg_operand" "b") - (match_operand:DI 2 "rs6000_tls_symbol_ref" "")] - UNSPEC_TLSGOTDTPREL))] - "HAVE_AS_TLS && TARGET_64BIT" - "ld %0,%2@got@dtprel(%1)") - -(define_insn "tls_tprel_32" - [(set (match_operand:SI 0 "gpc_reg_operand" "=r") - (unspec:SI [(match_operand:SI 1 "gpc_reg_operand" "b") - (match_operand:SI 2 "rs6000_tls_symbol_ref" "")] - UNSPEC_TLSTPREL))] - "HAVE_AS_TLS && !TARGET_64BIT" - "addi %0,%1,%2@tprel") +(define_insn "*tls_ld_call_aix<TLSmode:tls_abi_suffix>" + [(set (match_operand:TLSmode 0 "gpc_reg_operand" "=b") + (call (mem:TLSmode (match_operand:TLSmode 1 "symbol_ref_operand" "s")) + (match_operand 2 "" "g"))) + (unspec:TLSmode [(const_int 0)] UNSPEC_TLSLD) + (clobber (reg:SI LR_REGNO))] + "HAVE_AS_TLS && DEFAULT_ABI == ABI_AIX && TARGET_TLS_MARKERS" + "bl %z1(%&@tlsld)\;%." + [(set_attr "type" "branch") + (set_attr "length" "8")]) -(define_insn "tls_tprel_64" - [(set (match_operand:DI 0 "gpc_reg_operand" "=r") - (unspec:DI [(match_operand:DI 1 "gpc_reg_operand" "b") - (match_operand:DI 2 "rs6000_tls_symbol_ref" "")] - UNSPEC_TLSTPREL))] - "HAVE_AS_TLS && TARGET_64BIT" - "addi %0,%1,%2@tprel") +(define_insn "*tls_ld_call_sysv<TLSmode:tls_abi_suffix>" + [(set (match_operand:TLSmode 0 "gpc_reg_operand" "=b") + (call (mem:TLSmode (match_operand:TLSmode 1 "symbol_ref_operand" "s")) + (match_operand 2 "" "g"))) + (unspec:TLSmode [(const_int 0)] UNSPEC_TLSLD) + (clobber (reg:SI LR_REGNO))] + "HAVE_AS_TLS && DEFAULT_ABI == ABI_V4 && TARGET_TLS_MARKERS" +{ + if (flag_pic) + { + if (TARGET_SECURE_PLT && flag_pic == 2) + return "bl %z1+32768(%&@tlsld)@plt"; + return "bl %z1(%&@tlsld)@plt"; + } + return "bl %z1(%&@tlsld)"; +} + [(set_attr "type" "branch") + (set_attr "length" "4")]) -(define_insn "tls_tprel_ha_32" - [(set (match_operand:SI 0 "gpc_reg_operand" "=r") - 
(unspec:SI [(match_operand:SI 1 "gpc_reg_operand" "b") - (match_operand:SI 2 "rs6000_tls_symbol_ref" "")] - UNSPEC_TLSTPRELHA))] - "HAVE_AS_TLS && !TARGET_64BIT" - "addis %0,%1,%2@tprel@ha") +(define_insn "tls_dtprel_<TLSmode:tls_abi_suffix>" + [(set (match_operand:TLSmode 0 "gpc_reg_operand" "=r") + (unspec:TLSmode [(match_operand:TLSmode 1 "gpc_reg_operand" "b") + (match_operand:TLSmode 2 "rs6000_tls_symbol_ref" "")] + UNSPEC_TLSDTPREL))] + "HAVE_AS_TLS" + "addi %0,%1,%2@dtprel") -(define_insn "tls_tprel_ha_64" - [(set (match_operand:DI 0 "gpc_reg_operand" "=r") - (unspec:DI [(match_operand:DI 1 "gpc_reg_operand" "b") - (match_operand:DI 2 "rs6000_tls_symbol_ref" "")] - UNSPEC_TLSTPRELHA))] - "HAVE_AS_TLS && TARGET_64BIT" - "addis %0,%1,%2@tprel@ha") +(define_insn "tls_dtprel_ha_<TLSmode:tls_abi_suffix>" + [(set (match_operand:TLSmode 0 "gpc_reg_operand" "=r") + (unspec:TLSmode [(match_operand:TLSmode 1 "gpc_reg_operand" "b") + (match_operand:TLSmode 2 "rs6000_tls_symbol_ref" "")] + UNSPEC_TLSDTPRELHA))] + "HAVE_AS_TLS" + "addis %0,%1,%2@dtprel@ha") -(define_insn "tls_tprel_lo_32" - [(set (match_operand:SI 0 "gpc_reg_operand" "=r") - (unspec:SI [(match_operand:SI 1 "gpc_reg_operand" "b") - (match_operand:SI 2 "rs6000_tls_symbol_ref" "")] - UNSPEC_TLSTPRELLO))] - "HAVE_AS_TLS && !TARGET_64BIT" - "addi %0,%1,%2@tprel@l") +(define_insn "tls_dtprel_lo_<TLSmode:tls_abi_suffix>" + [(set (match_operand:TLSmode 0 "gpc_reg_operand" "=r") + (unspec:TLSmode [(match_operand:TLSmode 1 "gpc_reg_operand" "b") + (match_operand:TLSmode 2 "rs6000_tls_symbol_ref" "")] + UNSPEC_TLSDTPRELLO))] + "HAVE_AS_TLS" + "addi %0,%1,%2@dtprel@l") -(define_insn "tls_tprel_lo_64" - [(set (match_operand:DI 0 "gpc_reg_operand" "=r") - (unspec:DI [(match_operand:DI 1 "gpc_reg_operand" "b") - (match_operand:DI 2 "rs6000_tls_symbol_ref" "")] - UNSPEC_TLSTPRELLO))] - "HAVE_AS_TLS && TARGET_64BIT" +(define_insn "tls_got_dtprel_<TLSmode:tls_abi_suffix>" + [(set (match_operand:TLSmode 0 "gpc_reg_operand" 
"=r") + (unspec:TLSmode [(match_operand:TLSmode 1 "gpc_reg_operand" "b") + (match_operand:TLSmode 2 "rs6000_tls_symbol_ref" "")] + UNSPEC_TLSGOTDTPREL))] + "HAVE_AS_TLS" + "l<TLSmode:tls_insn_suffix> %0,%2@got@dtprel(%1)") + +(define_insn "tls_tprel_<TLSmode:tls_abi_suffix>" + [(set (match_operand:TLSmode 0 "gpc_reg_operand" "=r") + (unspec:TLSmode [(match_operand:TLSmode 1 "gpc_reg_operand" "b") + (match_operand:TLSmode 2 "rs6000_tls_symbol_ref" "")] + UNSPEC_TLSTPREL))] + "HAVE_AS_TLS" + "addi %0,%1,%2@tprel") + +(define_insn "tls_tprel_ha_<TLSmode:tls_abi_suffix>" + [(set (match_operand:TLSmode 0 "gpc_reg_operand" "=r") + (unspec:TLSmode [(match_operand:TLSmode 1 "gpc_reg_operand" "b") + (match_operand:TLSmode 2 "rs6000_tls_symbol_ref" "")] + UNSPEC_TLSTPRELHA))] + "HAVE_AS_TLS" + "addis %0,%1,%2@tprel@ha") + +(define_insn "tls_tprel_lo_<TLSmode:tls_abi_suffix>" + [(set (match_operand:TLSmode 0 "gpc_reg_operand" "=r") + (unspec:TLSmode [(match_operand:TLSmode 1 "gpc_reg_operand" "b") + (match_operand:TLSmode 2 "rs6000_tls_symbol_ref" "")] + UNSPEC_TLSTPRELLO))] + "HAVE_AS_TLS" "addi %0,%1,%2@tprel@l") ;; "b" output constraint here and on tls_tls input to support linker tls ;; optimization. The linker may edit the instructions emitted by a ;; tls_got_tprel/tls_tls pair to addis,addi. 
-(define_insn "tls_got_tprel_32" - [(set (match_operand:SI 0 "gpc_reg_operand" "=b") - (unspec:SI [(match_operand:SI 1 "gpc_reg_operand" "b") - (match_operand:SI 2 "rs6000_tls_symbol_ref" "")] - UNSPEC_TLSGOTTPREL))] - "HAVE_AS_TLS && !TARGET_64BIT" - "lwz %0,%2@got@tprel(%1)") - -(define_insn "tls_got_tprel_64" - [(set (match_operand:DI 0 "gpc_reg_operand" "=b") - (unspec:DI [(match_operand:DI 1 "gpc_reg_operand" "b") - (match_operand:DI 2 "rs6000_tls_symbol_ref" "")] - UNSPEC_TLSGOTTPREL))] - "HAVE_AS_TLS && TARGET_64BIT" - "ld %0,%2@got@tprel(%1)") - -(define_insn "tls_tls_32" - [(set (match_operand:SI 0 "gpc_reg_operand" "=r") - (unspec:SI [(match_operand:SI 1 "gpc_reg_operand" "b") - (match_operand:SI 2 "rs6000_tls_symbol_ref" "")] - UNSPEC_TLSTLS))] - "HAVE_AS_TLS && !TARGET_64BIT" +(define_insn "tls_got_tprel_<TLSmode:tls_abi_suffix>" + [(set (match_operand:TLSmode 0 "gpc_reg_operand" "=b") + (unspec:TLSmode [(match_operand:TLSmode 1 "gpc_reg_operand" "b") + (match_operand:TLSmode 2 "rs6000_tls_symbol_ref" "")] + UNSPEC_TLSGOTTPREL))] + "HAVE_AS_TLS" + "l<TLSmode:tls_insn_suffix> %0,%2@got@tprel(%1)") + +(define_insn "tls_tls_<TLSmode:tls_abi_suffix>" + [(set (match_operand:TLSmode 0 "gpc_reg_operand" "=r") + (unspec:TLSmode [(match_operand:TLSmode 1 "gpc_reg_operand" "b") + (match_operand:TLSmode 2 "rs6000_tls_symbol_ref" "")] + UNSPEC_TLSTLS))] + "HAVE_AS_TLS" "add %0,%1,%2@tls") -(define_insn "tls_tls_64" - [(set (match_operand:DI 0 "gpc_reg_operand" "=r") - (unspec:DI [(match_operand:DI 1 "gpc_reg_operand" "b") - (match_operand:DI 2 "rs6000_tls_symbol_ref" "")] - UNSPEC_TLSTLS))] - "HAVE_AS_TLS && TARGET_64BIT" - "add %0,%1,%2@tls") ;; Next come insns related to the calling sequence. 
;; @@ -10524,6 +11297,7 @@ { rtx chain = gen_reg_rtx (Pmode); rtx stack_bot = gen_rtx_MEM (Pmode, stack_pointer_rtx); rtx neg_op0; + rtx insn, par, set, mem; emit_move_insn (chain, stack_bot); @@ -10550,16 +11324,22 @@ else neg_op0 = GEN_INT (- INTVAL (operands[1])); - if (TARGET_UPDATE) - emit_insn ((* ((TARGET_32BIT) ? gen_movsi_update : gen_movdi_di_update)) - (stack_pointer_rtx, stack_pointer_rtx, neg_op0, chain)); - - else - { - emit_insn ((* ((TARGET_32BIT) ? gen_addsi3 : gen_adddi3)) - (stack_pointer_rtx, stack_pointer_rtx, neg_op0)); - emit_move_insn (gen_rtx_MEM (Pmode, stack_pointer_rtx), chain); - } + insn = emit_insn ((* ((TARGET_32BIT) ? gen_movsi_update_stack + : gen_movdi_di_update_stack)) + (stack_pointer_rtx, stack_pointer_rtx, neg_op0, + chain)); + /* Since we didn't use gen_frame_mem to generate the MEM, grab + it now and set the alias set/attributes. The above gen_*_update + calls will generate a PARALLEL with the MEM set being the first + operation. */ + par = PATTERN (insn); + gcc_assert (GET_CODE (par) == PARALLEL); + set = XVECEXP (par, 0, 0); + gcc_assert (GET_CODE (set) == SET); + mem = SET_DEST (set); + gcc_assert (MEM_P (mem)); + MEM_NOTRAP_P (mem) = 1; + set_mem_alias_set (mem, get_frame_alias_set ()); emit_move_insn (operands[0], virtual_stack_dynamic_rtx); DONE; @@ -10750,12 +11530,12 @@ #if TARGET_MACHO if (DEFAULT_ABI == ABI_DARWIN) { - const char *picbase = machopic_function_base_name (); - rtx picrtx = gen_rtx_SYMBOL_REF (Pmode, ggc_strdup (picbase)); + rtx picrtx = gen_rtx_SYMBOL_REF (Pmode, MACHOPIC_FUNCTION_BASE_NAME); rtx picreg = gen_rtx_REG (Pmode, RS6000_PIC_OFFSET_TABLE_REGNUM); rtx tmplabrtx; char tmplab[20]; + crtl->uses_pic_offset_table = 1; ASM_GENERATE_INTERNAL_LABEL(tmplab, \"LSJR\", CODE_LABEL_NUMBER (operands[0])); tmplabrtx = gen_rtx_SYMBOL_REF (Pmode, ggc_strdup (tmplab)); @@ -12191,43 +12971,44 @@ [(set (match_operand:CCFP 0 "cc_reg_operand" "=y") (compare:CCFP (match_operand:SF 1 "gpc_reg_operand" "f") 
(match_operand:SF 2 "gpc_reg_operand" "f")))] - "TARGET_HARD_FLOAT && TARGET_FPRS" + "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT" "fcmpu %0,%1,%2" [(set_attr "type" "fpcompare")]) (define_insn "*cmpdf_internal1" [(set (match_operand:CCFP 0 "cc_reg_operand" "=y") - (compare:CCFP (match_operand:DF 1 "gpc_reg_operand" "f") - (match_operand:DF 2 "gpc_reg_operand" "f")))] - "TARGET_HARD_FLOAT && TARGET_FPRS" + (compare:CCFP (match_operand:DF 1 "gpc_reg_operand" "d") + (match_operand:DF 2 "gpc_reg_operand" "d")))] + "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT + && !VECTOR_UNIT_VSX_P (DFmode)" "fcmpu %0,%1,%2" [(set_attr "type" "fpcompare")]) ;; Only need to compare second words if first words equal (define_insn "*cmptf_internal1" [(set (match_operand:CCFP 0 "cc_reg_operand" "=y") - (compare:CCFP (match_operand:TF 1 "gpc_reg_operand" "f") - (match_operand:TF 2 "gpc_reg_operand" "f")))] + (compare:CCFP (match_operand:TF 1 "gpc_reg_operand" "d") + (match_operand:TF 2 "gpc_reg_operand" "d")))] "!TARGET_IEEEQUAD && !TARGET_XL_COMPAT - && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_LONG_DOUBLE_128" + && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && TARGET_LONG_DOUBLE_128" "fcmpu %0,%1,%2\;bne %0,$+8\;fcmpu %0,%L1,%L2" [(set_attr "type" "fpcompare") (set_attr "length" "12")]) (define_insn_and_split "*cmptf_internal2" [(set (match_operand:CCFP 0 "cc_reg_operand" "=y") - (compare:CCFP (match_operand:TF 1 "gpc_reg_operand" "f") - (match_operand:TF 2 "gpc_reg_operand" "f"))) - (clobber (match_scratch:DF 3 "=f")) - (clobber (match_scratch:DF 4 "=f")) - (clobber (match_scratch:DF 5 "=f")) - (clobber (match_scratch:DF 6 "=f")) - (clobber (match_scratch:DF 7 "=f")) - (clobber (match_scratch:DF 8 "=f")) - (clobber (match_scratch:DF 9 "=f")) - (clobber (match_scratch:DF 10 "=f"))] + (compare:CCFP (match_operand:TF 1 "gpc_reg_operand" "d") + (match_operand:TF 2 "gpc_reg_operand" "d"))) + (clobber (match_scratch:DF 3 "=d")) + (clobber (match_scratch:DF 
4 "=d")) + (clobber (match_scratch:DF 5 "=d")) + (clobber (match_scratch:DF 6 "=d")) + (clobber (match_scratch:DF 7 "=d")) + (clobber (match_scratch:DF 8 "=d")) + (clobber (match_scratch:DF 9 "=d")) + (clobber (match_scratch:DF 10 "=d"))] "!TARGET_IEEEQUAD && TARGET_XL_COMPAT - && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_LONG_DOUBLE_128" + && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && TARGET_LONG_DOUBLE_128" "#" "&& reload_completed" [(set (match_dup 3) (match_dup 13)) @@ -12298,7 +13079,7 @@ (define_insn "move_from_CR_gt_bit" [(set (match_operand:SI 0 "gpc_reg_operand" "=r") (unspec:SI [(match_operand 1 "cc_reg_operand" "y")] UNSPEC_MV_CR_GT))] - "TARGET_E500" + "TARGET_HARD_FLOAT && !TARGET_FPRS" "mfcr %0\;{rlinm|rlwinm} %0,%0,%D1,31,31" [(set_attr "type" "mfcr") (set_attr "length" "8")]) @@ -14683,12 +15464,25 @@ "{stm|stmw} %2,%1" [(set_attr "type" "store_ux")]) +(define_insn "*save_gpregs_<mode>" + [(match_parallel 0 "any_parallel_operand" + [(clobber (reg:P 65)) + (use (match_operand:P 1 "symbol_ref_operand" "s")) + (use (match_operand:P 2 "gpc_reg_operand" "r")) + (set (match_operand:P 3 "memory_operand" "=m") + (match_operand:P 4 "gpc_reg_operand" "r"))])] + "" + "bl %z1" + [(set_attr "type" "branch") + (set_attr "length" "4")]) + (define_insn "*save_fpregs_<mode>" [(match_parallel 0 "any_parallel_operand" [(clobber (reg:P 65)) - (use (match_operand:P 1 "call_operand" "s")) - (set (match_operand:DF 2 "memory_operand" "=m") - (match_operand:DF 3 "gpc_reg_operand" "f"))])] + (use (match_operand:P 1 "symbol_ref_operand" "s")) + (use (match_operand:P 2 "gpc_reg_operand" "r")) + (set (match_operand:DF 3 "memory_operand" "=m") + (match_operand:DF 4 "gpc_reg_operand" "d"))])] "" "bl %z1" [(set_attr "type" "branch") @@ -14777,15 +15571,43 @@ ; FIXME: This would probably be somewhat simpler if the Cygnus sibcall ; stuff was in GCC. Oh, and "any_parallel_operand" is a bit flexible... 
+(define_insn "*restore_gpregs_<mode>" + [(match_parallel 0 "any_parallel_operand" + [(clobber (match_operand:P 1 "register_operand" "=l")) + (use (match_operand:P 2 "symbol_ref_operand" "s")) + (use (match_operand:P 3 "gpc_reg_operand" "r")) + (set (match_operand:P 4 "gpc_reg_operand" "=r") + (match_operand:P 5 "memory_operand" "m"))])] + "" + "bl %z2" + [(set_attr "type" "branch") + (set_attr "length" "4")]) + +(define_insn "*return_and_restore_gpregs_<mode>" + [(match_parallel 0 "any_parallel_operand" + [(return) + (clobber (match_operand:P 1 "register_operand" "=l")) + (use (match_operand:P 2 "symbol_ref_operand" "s")) + (use (match_operand:P 3 "gpc_reg_operand" "r")) + (set (match_operand:P 4 "gpc_reg_operand" "=r") + (match_operand:P 5 "memory_operand" "m"))])] + "" + "b %z2" + [(set_attr "type" "branch") + (set_attr "length" "4")]) + (define_insn "*return_and_restore_fpregs_<mode>" [(match_parallel 0 "any_parallel_operand" [(return) - (use (reg:P 65)) - (use (match_operand:P 1 "call_operand" "s")) - (set (match_operand:DF 2 "gpc_reg_operand" "=f") - (match_operand:DF 3 "memory_operand" "m"))])] + (clobber (match_operand:P 1 "register_operand" "=l")) + (use (match_operand:P 2 "symbol_ref_operand" "s")) + (use (match_operand:P 3 "gpc_reg_operand" "r")) + (set (match_operand:DF 4 "gpc_reg_operand" "=d") + (match_operand:DF 5 "memory_operand" "m"))])] "" - "b %z1") + "b %z2" + [(set_attr "type" "branch") + (set_attr "length" "4")]) ; This is used in compiling the unwind routines. 
(define_expand "eh_return" @@ -14832,8 +15654,19 @@ }" [(set_attr "type" "load")]) +(define_insn "bpermd_<mode>" + [(set (match_operand:P 0 "gpc_reg_operand" "=r") + (unspec:P [(match_operand:P 1 "gpc_reg_operand" "r") + (match_operand:P 2 "gpc_reg_operand" "r")] UNSPEC_BPERM))] + "TARGET_POPCNTD" + "bpermd %0,%1,%2" + [(set_attr "type" "integer")]) + + (include "sync.md") +(include "vector.md") +(include "vsx.md") (include "altivec.md") (include "spe.md") (include "dfp.md") Index: gcc-4.3.4-20091019/gcc/config/rs6000/rs6000.opt =================================================================== --- gcc-4.3.4-20091019.orig/gcc/config/rs6000/rs6000.opt 2009-10-19 13:39:52.000000000 +0200 +++ gcc-4.3.4-20091019/gcc/config/rs6000/rs6000.opt 2009-10-19 13:40:37.000000000 +0200 @@ -1,6 +1,6 @@ ; Options for the rs6000 port of the compiler ; -; Copyright (C) 2005, 2006, 2007 Free Software Foundation, Inc. +; Copyright (C) 2005, 2006, 2007, 2008, 2009 Free Software Foundation, Inc. ; Contributed by Aldy Hernandez <aldy@quesejoda.com>. ; ; This file is part of GCC. @@ -111,28 +111,77 @@ mhard-float Target Report RejectNegative InverseMask(SOFT_FLOAT, HARD_FLOAT) Use hardware floating point -mno-update -Target Report RejectNegative Mask(NO_UPDATE) -Do not generate load/store with update instructions +mpopcntd +Target Report Mask(POPCNTD) +Use PowerPC V2.06 popcntd instruction + +mvsx +Target Report Mask(VSX) +Use vector/scalar (VSX) instructions + +mvsx-scalar-double +Target Undocumented Report Var(TARGET_VSX_SCALAR_DOUBLE) Init(-1) +; If -mvsx, use VSX arithmetic instructions for scalar double (on by default) + +mvsx-scalar-memory +Target Undocumented Report Var(TARGET_VSX_SCALAR_MEMORY) +; If -mvsx, use VSX scalar memory reference instructions for scalar double (off by default) + +mvsx-align-128 +Target Undocumented Report Var(TARGET_VSX_ALIGN_128) +; If -mvsx, set alignment to 128 bits instead of 32/64 + +; Note, enabling this on GCC 4.3 breaks calculix. 
It is ok on GCC 4.5. +mallow-movmisalign +Target Undocumented Var(TARGET_ALLOW_MOVMISALIGN) +; Allow/disallow the movmisalign in DF/DI vectors + +mallow-df-permute +Target Undocumented Var(TARGET_ALLOW_DF_PERMUTE) +; Allow/disallow permutation of DF/DI vectors + +msched-groups +Target Undocumented Report Var(TARGET_SCHED_GROUPS) Init(-1) +; Explicitly set/unset whether rs6000_sched_groups is set + +malways-hint +Target Undocumented Report Var(TARGET_ALWAYS_HINT) Init(-1) +; Explicitly set/unset whether rs6000_always_hint is set + +malign-branch-targets +Target Undocumented Report Var(TARGET_ALIGN_BRANCH_TARGETS) Init(-1) +; Explicitly set/unset whether rs6000_align_branch_targets is set + +; This should be enabled in the AT branch, but until we get all the bugs in +; compiling gamess, cactusADM, calculix, and wrf, don't define it. +; (gamess, cactusADM, and calculix fail in vectorizing sqrt on VSX) +; (wrf fails for some other builtin, probably copysignf on both VSX/Altivec). +mvectorize-builtins +Target Undocumented Report Var(TARGET_VECTORIZE_BUILTINS) +; Explicitly control whether we vectorize the builtins or not. 
mupdate -Target Report RejectNegative InverseMask(NO_UPDATE, UPDATE) +Target Report Var(TARGET_UPDATE) Init(1) Generate load/store with update instructions -mno-fused-madd -Target Report RejectNegative Mask(NO_FUSED_MADD) -Do not generate fused multiply/add instructions +mavoid-indexed-addresses +Target Report Var(TARGET_AVOID_XFORM) Init(-1) +Avoid generation of indexed load/store instructions when possible mfused-madd -Target Report RejectNegative InverseMask(NO_FUSED_MADD, FUSED_MADD) +Target Report Var(TARGET_FUSED_MADD) Init(1) Generate fused multiply/add instructions -msched-prolog -Target Report Var(TARGET_SCHED_PROLOG) Init(1) -Schedule the start and end of the procedure +mtls-markers +Target Report Var(tls_markers) Init(1) +Mark __tls_get_addr calls with argument info msched-epilog -Target Undocumented Var(TARGET_SCHED_PROLOG) VarExists +Target Undocumented Var(TARGET_SCHED_PROLOG) Init(1) + +msched-prolog +Target Report Var(TARGET_SCHED_PROLOG) VarExists +Schedule the start and end of the procedure maix-struct-return Target Report RejectNegative Var(aix_struct_return) @@ -190,7 +239,7 @@ Target RejectNegative Joined -mvrsave=yes/no Deprecated option. Use -mvrsave/-mno-vrsave instead misel -Target Var(rs6000_isel) +Target Report Mask(ISEL) Generate isel instructions misel= @@ -198,7 +247,7 @@ Target RejectNegative Joined -misel=yes/no Deprecated option. 
Use -misel/-mno-isel instead mspe -Target Var(rs6000_spe) +Target Generate SPE SIMD instructions on E500 mpaired @@ -239,7 +288,7 @@ Generate Cell microcode mwarn-cell-microcode Target Var(rs6000_warn_cell_microcode) Init(0) Warning -Emitting warning when a Cell microcode is emitted +Warn when a Cell microcoded instruction is emitted mwarn-altivec-long Target Var(rs6000_warn_altivec_long) Init(1) @@ -268,3 +317,25 @@ Specify alignment of structure fields de mprioritize-restricted-insns= Target RejectNegative Joined UInteger Var(rs6000_sched_restricted_insns_priority) Specify scheduling priority for dispatch slot restricted insns + +msingle-float +Target RejectNegative Var(rs6000_single_float) +Single-precision floating point unit + +mdouble-float +Target RejectNegative Var(rs6000_double_float) +Double-precision floating point unit + +msimple-fpu +Target RejectNegative Var(rs6000_simple_fpu) +Floating point unit does not support divide & sqrt + +mfpu= +Target RejectNegative Joined +-mfpu= Specify FP (sp, dp, sp-lite, dp-lite) (implies -mxilinx-fpu) + +mxilinx-fpu +Target Var(rs6000_xilinx_fpu) +Specify Xilinx FPU. + + Index: gcc-4.3.4-20091019/gcc/config/rs6000/rs6000-protos.h =================================================================== --- gcc-4.3.4-20091019.orig/gcc/config/rs6000/rs6000-protos.h 2009-10-19 13:39:52.000000000 +0200 +++ gcc-4.3.4-20091019/gcc/config/rs6000/rs6000-protos.h 2009-10-19 13:40:37.000000000 +0200 @@ -1,5 +1,5 @@ /* Definitions of target machine for GNU compiler, for IBM RS/6000. - Copyright (C) 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008 + Copyright (C) 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009 Free Software Foundation, Inc. 
Contributed by Richard Kenner (kenner@vlsi1.ultra.nyu.edu) @@ -42,6 +42,7 @@ extern void validate_condition_mode (enu extern bool legitimate_constant_pool_address_p (rtx); extern bool legitimate_indirect_address_p (rtx, int); extern bool legitimate_indexed_address_p (rtx, int); +extern bool avoiding_indexed_address_p (enum machine_mode); extern rtx rs6000_got_register (rtx); extern rtx find_addr_reg (rtx); @@ -63,9 +64,20 @@ extern int insvdi_rshift_rlwimi_p (rtx, extern int registers_ok_for_quad_peep (rtx, rtx); extern int mems_ok_for_quad_peep (rtx, rtx); extern bool gpr_or_gpr_p (rtx, rtx); -extern enum reg_class rs6000_secondary_reload_class (enum reg_class, - enum machine_mode, rtx); - +extern enum reg_class (*rs6000_preferred_reload_class_ptr) (rtx, + enum reg_class); +extern enum reg_class (*rs6000_secondary_reload_class_ptr) (enum reg_class, + enum machine_mode, + rtx); +extern bool (*rs6000_secondary_memory_needed_ptr) (enum reg_class, + enum reg_class, + enum machine_mode); +extern bool (*rs6000_cannot_change_mode_class_ptr) (enum machine_mode, + enum machine_mode, + enum reg_class); +extern bool (*rs6000_legitimate_address_ptr) (enum machine_mode, rtx, bool); +extern rtx (*rs6000_legitimize_address_ptr) (rtx, rtx, enum machine_mode); +extern void rs6000_secondary_reload_inner (rtx, rtx, rtx, bool); extern int paired_emit_vector_cond_expr (rtx, rtx, rtx, rtx, rtx, rtx); extern void paired_expand_vector_move (rtx operands[]); @@ -77,6 +89,7 @@ extern int extract_ME (rtx); extern void rs6000_output_function_entry (FILE *, const char *); extern void print_operand (FILE *, rtx, int); extern void print_operand_address (FILE *, rtx); +extern bool rs6000_output_addr_const_extra (FILE *, rtx); extern enum rtx_code rs6000_reverse_condition (enum machine_mode, enum rtx_code); extern void rs6000_emit_sCOND (enum rtx_code, rtx); @@ -105,12 +118,11 @@ extern rtx create_TOC_reference (rtx); extern void rs6000_split_multireg_move (rtx, rtx); extern void rs6000_emit_move 
(rtx, rtx, enum machine_mode); extern rtx rs6000_secondary_memory_needed_rtx (enum machine_mode); -extern rtx rs6000_legitimize_address (rtx, rtx, enum machine_mode); -extern rtx rs6000_legitimize_reload_address (rtx, enum machine_mode, - int, int, int, int *); -extern int rs6000_legitimate_address (enum machine_mode, rtx, int); +extern rtx (*rs6000_legitimize_reload_address_ptr) (rtx, enum machine_mode, + int, int, int, int *); extern bool rs6000_legitimate_offset_address_p (enum machine_mode, rtx, int); -extern bool rs6000_mode_dependent_address (rtx); +extern bool (*rs6000_mode_dependent_address_ptr) (rtx); +extern rtx rs6000_find_base_term (rtx); extern bool rs6000_offsettable_memref_p (rtx); extern rtx rs6000_return_addr (int, rtx); extern void rs6000_output_symbol_ref (FILE*, rtx); @@ -167,7 +179,8 @@ extern int rs6000_register_move_cost (en enum reg_class, enum reg_class); extern int rs6000_memory_move_cost (enum machine_mode, enum reg_class, int); extern bool rs6000_tls_referenced_p (rtx); -extern int rs6000_hard_regno_nregs (int, enum machine_mode); +extern bool rs6000_has_indirect_jump_p (void); +extern void rs6000_set_indirect_jump (void); extern void rs6000_conditional_register_usage (void); /* Declare functions in rs6000-c.c */ @@ -179,7 +192,13 @@ extern void rs6000_cpu_cpp_builtins (str char *output_call (rtx, rtx *, int, int); #endif +#ifdef NO_DOLLAR_IN_LABEL +const char * rs6000_xcoff_strip_dollar (const char *); +#endif + void rs6000_final_prescan_insn (rtx, rtx *operand, int num_operands); extern bool rs6000_hard_regno_mode_ok_p[][FIRST_PSEUDO_REGISTER]; +extern unsigned char rs6000_class_max_nregs[][LIM_REG_CLASSES]; +extern unsigned char rs6000_hard_regno_nregs[][FIRST_PSEUDO_REGISTER]; #endif /* rs6000-protos.h */ Index: gcc-4.3.4-20091019/gcc/config/rs6000/spe.md =================================================================== --- gcc-4.3.4-20091019.orig/gcc/config/rs6000/spe.md 2008-02-19 10:55:53.000000000 +0100 +++ 
gcc-4.3.4-20091019/gcc/config/rs6000/spe.md 2009-10-19 13:40:37.000000000 +0200 @@ -1,5 +1,5 @@ ;; e500 SPE description -;; Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007 +;; Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008 ;; Free Software Foundation, Inc. ;; Contributed by Aldy Hernandez (aldy@quesejoda.com) @@ -99,7 +99,7 @@ ;; Floating point conversion instructions. -(define_insn "fixuns_truncdfsi2" +(define_insn "spe_fixuns_truncdfsi2" [(set (match_operand:SI 0 "gpc_reg_operand" "=r") (unsigned_fix:SI (match_operand:DF 1 "gpc_reg_operand" "r")))] "TARGET_HARD_FLOAT && TARGET_E500_DOUBLE" @@ -2933,7 +2933,8 @@ [(compare:CCFP (match_operand:SF 1 "gpc_reg_operand" "r") (match_operand:SF 2 "gpc_reg_operand" "r"))] 1000))] - "TARGET_HARD_FLOAT && !TARGET_FPRS && !flag_unsafe_math_optimizations" + "TARGET_HARD_FLOAT && !TARGET_FPRS + && !(flag_finite_math_only && !flag_trapping_math)" "efscmpeq %0,%1,%2" [(set_attr "type" "veccmp")]) @@ -2943,7 +2944,8 @@ [(compare:CCFP (match_operand:SF 1 "gpc_reg_operand" "r") (match_operand:SF 2 "gpc_reg_operand" "r"))] 1001))] - "TARGET_HARD_FLOAT && !TARGET_FPRS && flag_unsafe_math_optimizations" + "TARGET_HARD_FLOAT && !TARGET_FPRS + && flag_finite_math_only && !flag_trapping_math" "efststeq %0,%1,%2" [(set_attr "type" "veccmpsimple")]) @@ -2953,7 +2955,8 @@ [(compare:CCFP (match_operand:SF 1 "gpc_reg_operand" "r") (match_operand:SF 2 "gpc_reg_operand" "r"))] 1002))] - "TARGET_HARD_FLOAT && !TARGET_FPRS && !flag_unsafe_math_optimizations" + "TARGET_HARD_FLOAT && !TARGET_FPRS + && !(flag_finite_math_only && !flag_trapping_math)" "efscmpgt %0,%1,%2" [(set_attr "type" "veccmp")]) @@ -2963,7 +2966,8 @@ [(compare:CCFP (match_operand:SF 1 "gpc_reg_operand" "r") (match_operand:SF 2 "gpc_reg_operand" "r"))] 1003))] - "TARGET_HARD_FLOAT && !TARGET_FPRS && flag_unsafe_math_optimizations" + "TARGET_HARD_FLOAT && !TARGET_FPRS + && flag_finite_math_only && !flag_trapping_math" "efststgt %0,%1,%2" [(set_attr "type" "veccmpsimple")]) 
@@ -2973,7 +2977,8 @@ [(compare:CCFP (match_operand:SF 1 "gpc_reg_operand" "r") (match_operand:SF 2 "gpc_reg_operand" "r"))] 1004))] - "TARGET_HARD_FLOAT && !TARGET_FPRS && !flag_unsafe_math_optimizations" + "TARGET_HARD_FLOAT && !TARGET_FPRS + && !(flag_finite_math_only && !flag_trapping_math)" "efscmplt %0,%1,%2" [(set_attr "type" "veccmp")]) @@ -2983,7 +2988,8 @@ [(compare:CCFP (match_operand:SF 1 "gpc_reg_operand" "r") (match_operand:SF 2 "gpc_reg_operand" "r"))] 1005))] - "TARGET_HARD_FLOAT && !TARGET_FPRS && flag_unsafe_math_optimizations" + "TARGET_HARD_FLOAT && !TARGET_FPRS + && flag_finite_math_only && !flag_trapping_math" "efststlt %0,%1,%2" [(set_attr "type" "veccmpsimple")]) @@ -2995,7 +3001,8 @@ [(compare:CCFP (match_operand:DF 1 "gpc_reg_operand" "r") (match_operand:DF 2 "gpc_reg_operand" "r"))] CMPDFEQ_GPR))] - "TARGET_HARD_FLOAT && TARGET_E500_DOUBLE && !flag_unsafe_math_optimizations" + "TARGET_HARD_FLOAT && TARGET_E500_DOUBLE + && !(flag_finite_math_only && !flag_trapping_math)" "efdcmpeq %0,%1,%2" [(set_attr "type" "veccmp")]) @@ -3005,7 +3012,8 @@ [(compare:CCFP (match_operand:DF 1 "gpc_reg_operand" "r") (match_operand:DF 2 "gpc_reg_operand" "r"))] TSTDFEQ_GPR))] - "TARGET_HARD_FLOAT && TARGET_E500_DOUBLE && flag_unsafe_math_optimizations" + "TARGET_HARD_FLOAT && TARGET_E500_DOUBLE + && flag_finite_math_only && !flag_trapping_math" "efdtsteq %0,%1,%2" [(set_attr "type" "veccmpsimple")]) @@ -3015,7 +3023,8 @@ [(compare:CCFP (match_operand:DF 1 "gpc_reg_operand" "r") (match_operand:DF 2 "gpc_reg_operand" "r"))] CMPDFGT_GPR))] - "TARGET_HARD_FLOAT && TARGET_E500_DOUBLE && !flag_unsafe_math_optimizations" + "TARGET_HARD_FLOAT && TARGET_E500_DOUBLE + && !(flag_finite_math_only && !flag_trapping_math)" "efdcmpgt %0,%1,%2" [(set_attr "type" "veccmp")]) @@ -3025,7 +3034,8 @@ [(compare:CCFP (match_operand:DF 1 "gpc_reg_operand" "r") (match_operand:DF 2 "gpc_reg_operand" "r"))] TSTDFGT_GPR))] - "TARGET_HARD_FLOAT && TARGET_E500_DOUBLE && 
flag_unsafe_math_optimizations" + "TARGET_HARD_FLOAT && TARGET_E500_DOUBLE + && flag_finite_math_only && !flag_trapping_math" "efdtstgt %0,%1,%2" [(set_attr "type" "veccmpsimple")]) @@ -3035,7 +3045,8 @@ [(compare:CCFP (match_operand:DF 1 "gpc_reg_operand" "r") (match_operand:DF 2 "gpc_reg_operand" "r"))] CMPDFLT_GPR))] - "TARGET_HARD_FLOAT && TARGET_E500_DOUBLE && !flag_unsafe_math_optimizations" + "TARGET_HARD_FLOAT && TARGET_E500_DOUBLE + && !(flag_finite_math_only && !flag_trapping_math)" "efdcmplt %0,%1,%2" [(set_attr "type" "veccmp")]) @@ -3045,7 +3056,8 @@ [(compare:CCFP (match_operand:DF 1 "gpc_reg_operand" "r") (match_operand:DF 2 "gpc_reg_operand" "r"))] TSTDFLT_GPR))] - "TARGET_HARD_FLOAT && TARGET_E500_DOUBLE && flag_unsafe_math_optimizations" + "TARGET_HARD_FLOAT && TARGET_E500_DOUBLE + && flag_finite_math_only && !flag_trapping_math" "efdtstlt %0,%1,%2" [(set_attr "type" "veccmpsimple")]) @@ -3059,7 +3071,7 @@ CMPTFEQ_GPR))] "!TARGET_IEEEQUAD && TARGET_HARD_FLOAT && TARGET_E500_DOUBLE && TARGET_LONG_DOUBLE_128 - && !flag_unsafe_math_optimizations" + && !(flag_finite_math_only && !flag_trapping_math)" "efdcmpeq %0,%1,%2\;bng %0,$+8\;efdcmpeq %0,%L1,%L2" [(set_attr "type" "veccmp") (set_attr "length" "12")]) @@ -3072,7 +3084,7 @@ TSTTFEQ_GPR))] "!TARGET_IEEEQUAD && TARGET_HARD_FLOAT && TARGET_E500_DOUBLE && TARGET_LONG_DOUBLE_128 - && flag_unsafe_math_optimizations" + && flag_finite_math_only && !flag_trapping_math" "efdtsteq %0,%1,%2\;bng %0,$+8\;efdtsteq %0,%L1,%L2" [(set_attr "type" "veccmpsimple") (set_attr "length" "12")]) @@ -3085,7 +3097,7 @@ CMPTFGT_GPR))] "!TARGET_IEEEQUAD && TARGET_HARD_FLOAT && TARGET_E500_DOUBLE && TARGET_LONG_DOUBLE_128 - && !flag_unsafe_math_optimizations" + && !(flag_finite_math_only && !flag_trapping_math)" "efdcmpgt %0,%1,%2\;bgt %0,$+16\;efdcmpeq %0,%1,%2\;bng %0,$+8\;efdcmpgt %0,%L1,%L2" [(set_attr "type" "veccmp") (set_attr "length" "20")]) @@ -3098,7 +3110,7 @@ TSTTFGT_GPR))] "!TARGET_IEEEQUAD && TARGET_HARD_FLOAT 
&& TARGET_E500_DOUBLE && TARGET_LONG_DOUBLE_128 - && flag_unsafe_math_optimizations" + && flag_finite_math_only && !flag_trapping_math" "efdtstgt %0,%1,%2\;bgt %0,$+16\;efdtsteq %0,%1,%2\;bng %0,$+8\;efdtstgt %0,%L1,%L2" [(set_attr "type" "veccmpsimple") (set_attr "length" "20")]) @@ -3111,7 +3123,7 @@ CMPTFLT_GPR))] "!TARGET_IEEEQUAD && TARGET_HARD_FLOAT && TARGET_E500_DOUBLE && TARGET_LONG_DOUBLE_128 - && !flag_unsafe_math_optimizations" + && !(flag_finite_math_only && !flag_trapping_math)" "efdcmplt %0,%1,%2\;bgt %0,$+16\;efdcmpeq %0,%1,%2\;bng %0,$+8\;efdcmplt %0,%L1,%L2" [(set_attr "type" "veccmp") (set_attr "length" "20")]) @@ -3124,7 +3136,7 @@ TSTTFLT_GPR))] "!TARGET_IEEEQUAD && TARGET_HARD_FLOAT && TARGET_E500_DOUBLE && TARGET_LONG_DOUBLE_128 - && flag_unsafe_math_optimizations" + && flag_finite_math_only && !flag_trapping_math" "efdtstlt %0,%1,%2\;bgt %0,$+16\;efdtsteq %0,%1,%2\;bng %0,$+8\;efdtstlt %0,%L1,%L2" [(set_attr "type" "veccmpsimple") (set_attr "length" "20")]) @@ -3135,6 +3147,44 @@ (unspec:CCFP [(match_operand 1 "cc_reg_operand" "y") (match_operand 2 "cc_reg_operand" "y")] E500_CR_IOR_COMPARE))] - "TARGET_E500" + "TARGET_HARD_FLOAT && !TARGET_FPRS" "cror 4*%0+gt,4*%1+gt,4*%2+gt" [(set_attr "type" "cr_logical")]) + +;; Out-of-line prologues and epilogues. 
+(define_insn "*save_gpregs_spe" + [(match_parallel 0 "any_parallel_operand" + [(clobber (reg:P 65)) + (use (match_operand:P 1 "symbol_ref_operand" "s")) + (use (match_operand:P 2 "gpc_reg_operand" "r")) + (set (match_operand:V2SI 3 "memory_operand" "=m") + (match_operand:V2SI 4 "gpc_reg_operand" "r"))])] + "TARGET_SPE_ABI" + "bl %z1" + [(set_attr "type" "branch") + (set_attr "length" "4")]) + +(define_insn "*restore_gpregs_spe" + [(match_parallel 0 "any_parallel_operand" + [(clobber (reg:P 65)) + (use (match_operand:P 1 "symbol_ref_operand" "s")) + (use (match_operand:P 2 "gpc_reg_operand" "r")) + (set (match_operand:V2SI 3 "gpc_reg_operand" "=r") + (match_operand:V2SI 4 "memory_operand" "m"))])] + "TARGET_SPE_ABI" + "bl %z1" + [(set_attr "type" "branch") + (set_attr "length" "4")]) + +(define_insn "*return_and_restore_gpregs_spe" + [(match_parallel 0 "any_parallel_operand" + [(return) + (clobber (reg:P 65)) + (use (match_operand:P 1 "symbol_ref_operand" "s")) + (use (match_operand:P 2 "gpc_reg_operand" "r")) + (set (match_operand:V2SI 3 "gpc_reg_operand" "=r") + (match_operand:V2SI 4 "memory_operand" "m"))])] + "TARGET_SPE_ABI" + "b %z1" + [(set_attr "type" "branch") + (set_attr "length" "4")]) Index: gcc-4.3.4-20091019/gcc/config/rs6000/sysv4.h =================================================================== --- gcc-4.3.4-20091019.orig/gcc/config/rs6000/sysv4.h 2008-02-19 10:55:53.000000000 +0100 +++ gcc-4.3.4-20091019/gcc/config/rs6000/sysv4.h 2009-10-19 13:40:37.000000000 +0200 @@ -1,6 +1,6 @@ /* Target definitions for GNU compiler for PowerPC running System V.4 Copyright (C) 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, - 2004, 2005, 2006, 2007 Free Software Foundation, Inc. + 2004, 2005, 2006, 2007, 2008 Free Software Foundation, Inc. Contributed by Cygnus Support. This file is part of GCC. @@ -54,6 +54,7 @@ extern enum rs6000_sdata_type rs6000_sda #define TARGET_BITFIELD_TYPE (! TARGET_NO_BITFIELD_TYPE) #define TARGET_BIG_ENDIAN (! 
TARGET_LITTLE_ENDIAN) +#define TARGET_PROTOTYPE target_prototype #define TARGET_NO_PROTOTYPE (! TARGET_PROTOTYPE) #define TARGET_NO_TOC (! TARGET_TOC) #define TARGET_NO_EABI (! TARGET_EABI) @@ -119,9 +120,9 @@ do { \ else if (!strcmp (rs6000_abi_name, "i960-old")) \ { \ rs6000_current_abi = ABI_V4; \ - target_flags |= (MASK_LITTLE_ENDIAN | MASK_EABI \ - | MASK_NO_BITFIELD_WORD); \ + target_flags |= (MASK_LITTLE_ENDIAN | MASK_EABI); \ target_flags &= ~MASK_STRICT_ALIGN; \ + TARGET_NO_BITFIELD_WORD = 1; \ } \ else \ { \ @@ -266,19 +267,27 @@ do { \ #endif /* Define cutoff for using external functions to save floating point. - Currently on V.4, always use inline stores. */ -#define FP_SAVE_INLINE(FIRST_REG) ((FIRST_REG) < 64) + Currently on 64-bit V.4, always use inline stores. When optimizing + for size on 32-bit targets, use external functions when + profitable. */ +#define FP_SAVE_INLINE(FIRST_REG) (optimize_size && !TARGET_64BIT \ + ? ((FIRST_REG) == 62 \ + || (FIRST_REG) == 63) \ + : (FIRST_REG) < 64) +/* And similarly for general purpose registers. */ +#define GP_SAVE_INLINE(FIRST_REG) ((FIRST_REG) < 32 \ + && (TARGET_64BIT || !optimize_size)) /* Put jump tables in read-only memory, rather than in .text. */ #define JUMP_TABLES_IN_TEXT_SECTION 0 /* Prefix and suffix to use to saving floating point. */ #define SAVE_FP_PREFIX "_savefpr_" -#define SAVE_FP_SUFFIX "_l" +#define SAVE_FP_SUFFIX (TARGET_64BIT ? "_l" : "") /* Prefix and suffix to use to restoring floating point. */ #define RESTORE_FP_PREFIX "_restfpr_" -#define RESTORE_FP_SUFFIX "_l" +#define RESTORE_FP_SUFFIX (TARGET_64BIT ? "_l" : "") /* Type used for ptrdiff_t, as a string used in a declaration. 
*/ #define PTRDIFF_TYPE "int" @@ -655,7 +664,6 @@ extern int fixuplabelno; myellowknife : %(link_start_yellowknife) ; \ mmvme : %(link_start_mvme) ; \ msim : %(link_start_sim) ; \ - mwindiss : %(link_start_windiss) ; \ mcall-freebsd: %(link_start_freebsd) ; \ mcall-linux : %(link_start_linux) ; \ mcall-gnu : %(link_start_gnu) ; \ @@ -713,7 +721,6 @@ extern int fixuplabelno; myellowknife : %(link_os_yellowknife) ; \ mmvme : %(link_os_mvme) ; \ msim : %(link_os_sim) ; \ - mwindiss : %(link_os_windiss) ; \ mcall-freebsd: %(link_os_freebsd) ; \ mcall-linux : %(link_os_linux) ; \ mcall-gnu : %(link_os_gnu) ; \ @@ -723,6 +730,9 @@ extern int fixuplabelno; #define LINK_OS_DEFAULT_SPEC "" +#define DRIVER_SELF_SPECS "%{mfpu=none: %<mfpu=* \ + %<msingle-float %<mdouble-float}" + /* Override rs6000.h definition. */ #undef CPP_SPEC #define CPP_SPEC "%{posix: -D_POSIX_SOURCE} \ @@ -730,7 +740,6 @@ extern int fixuplabelno; myellowknife : %(cpp_os_yellowknife) ; \ mmvme : %(cpp_os_mvme) ; \ msim : %(cpp_os_sim) ; \ - mwindiss : %(cpp_os_windiss) ; \ mcall-freebsd: %(cpp_os_freebsd) ; \ mcall-linux : %(cpp_os_linux) ; \ mcall-gnu : %(cpp_os_gnu) ; \ @@ -747,7 +756,6 @@ extern int fixuplabelno; myellowknife : %(startfile_yellowknife) ; \ mmvme : %(startfile_mvme) ; \ msim : %(startfile_sim) ; \ - mwindiss : %(startfile_windiss) ; \ mcall-freebsd: %(startfile_freebsd) ; \ mcall-linux : %(startfile_linux) ; \ mcall-gnu : %(startfile_gnu) ; \ @@ -764,7 +772,6 @@ extern int fixuplabelno; myellowknife : %(lib_yellowknife) ; \ mmvme : %(lib_mvme) ; \ msim : %(lib_sim) ; \ - mwindiss : %(lib_windiss) ; \ mcall-freebsd: %(lib_freebsd) ; \ mcall-linux : %(lib_linux) ; \ mcall-gnu : %(lib_gnu) ; \ @@ -777,19 +784,18 @@ extern int fixuplabelno; /* Override svr4.h definition. 
*/ #undef ENDFILE_SPEC #define ENDFILE_SPEC "\ -%{mads : crtsavres.o%s %(endfile_ads) ; \ - myellowknife : crtsavres.o%s %(endfile_yellowknife) ; \ - mmvme : crtsavres.o%s %(endfile_mvme) ; \ - msim : crtsavres.o%s %(endfile_sim) ; \ - mwindiss : %(endfile_windiss) ; \ - mcall-freebsd: crtsavres.o%s %(endfile_freebsd) ; \ - mcall-linux : crtsavres.o%s %(endfile_linux) ; \ - mcall-gnu : crtsavres.o%s %(endfile_gnu) ; \ - mcall-netbsd : crtsavres.o%s %(endfile_netbsd) ; \ - mcall-openbsd: crtsavres.o%s %(endfile_openbsd) ; \ +%{mads : %(endfile_ads) ; \ + myellowknife : %(endfile_yellowknife) ; \ + mmvme : %(endfile_mvme) ; \ + msim : %(endfile_sim) ; \ + mcall-freebsd: %(endfile_freebsd) ; \ + mcall-linux : %(endfile_linux) ; \ + mcall-gnu : %(endfile_gnu) ; \ + mcall-netbsd : %(endfile_netbsd) ; \ + mcall-openbsd: %(endfile_openbsd) ; \ : %(crtsavres_default) %(endfile_default) }" -#define CRTSAVRES_DEFAULT_SPEC "crtsavres.o%s" +#define CRTSAVRES_DEFAULT_SPEC "" #define ENDFILE_DEFAULT_SPEC "crtend.o%s ecrtn.o%s" @@ -992,25 +998,6 @@ ncrtn.o%s" #define CPP_OS_OPENBSD_SPEC "%{posix:-D_POSIX_SOURCE} %{pthread:-D_POSIX_THREADS}" #endif -/* WindISS support. */ - -#define LIB_WINDISS_SPEC "--start-group -li -lcfp -lwindiss -lram -limpl -limpfp --end-group" - -#define CPP_OS_WINDISS_SPEC "\ --D__rtasim \ --D__EABI__ \ --D__ppc \ -%{!msoft-float: -D__hardfp} \ -" - -#define STARTFILE_WINDISS_SPEC "crt0.o%s crtbegin.o%s" - -#define ENDFILE_WINDISS_SPEC "crtend.o%s" - -#define LINK_START_WINDISS_SPEC "" - -#define LINK_OS_WINDISS_SPEC "" - /* Define any extra SPECS that the compiler needs to generate. */ /* Override rs6000.h definition. 
*/ #undef SUBTARGET_EXTRA_SPECS @@ -1025,7 +1012,6 @@ ncrtn.o%s" { "lib_linux", LIB_LINUX_SPEC }, \ { "lib_netbsd", LIB_NETBSD_SPEC }, \ { "lib_openbsd", LIB_OPENBSD_SPEC }, \ - { "lib_windiss", LIB_WINDISS_SPEC }, \ { "lib_default", LIB_DEFAULT_SPEC }, \ { "startfile_ads", STARTFILE_ADS_SPEC }, \ { "startfile_yellowknife", STARTFILE_YELLOWKNIFE_SPEC }, \ @@ -1036,7 +1022,6 @@ ncrtn.o%s" { "startfile_linux", STARTFILE_LINUX_SPEC }, \ { "startfile_netbsd", STARTFILE_NETBSD_SPEC }, \ { "startfile_openbsd", STARTFILE_OPENBSD_SPEC }, \ - { "startfile_windiss", STARTFILE_WINDISS_SPEC }, \ { "startfile_default", STARTFILE_DEFAULT_SPEC }, \ { "endfile_ads", ENDFILE_ADS_SPEC }, \ { "endfile_yellowknife", ENDFILE_YELLOWKNIFE_SPEC }, \ @@ -1047,7 +1032,6 @@ ncrtn.o%s" { "endfile_linux", ENDFILE_LINUX_SPEC }, \ { "endfile_netbsd", ENDFILE_NETBSD_SPEC }, \ { "endfile_openbsd", ENDFILE_OPENBSD_SPEC }, \ - { "endfile_windiss", ENDFILE_WINDISS_SPEC }, \ { "endfile_default", ENDFILE_DEFAULT_SPEC }, \ { "link_path", LINK_PATH_SPEC }, \ { "link_shlib", LINK_SHLIB_SPEC }, \ @@ -1062,7 +1046,6 @@ ncrtn.o%s" { "link_start_linux", LINK_START_LINUX_SPEC }, \ { "link_start_netbsd", LINK_START_NETBSD_SPEC }, \ { "link_start_openbsd", LINK_START_OPENBSD_SPEC }, \ - { "link_start_windiss", LINK_START_WINDISS_SPEC }, \ { "link_start_default", LINK_START_DEFAULT_SPEC }, \ { "link_os", LINK_OS_SPEC }, \ { "link_os_ads", LINK_OS_ADS_SPEC }, \ @@ -1074,7 +1057,6 @@ ncrtn.o%s" { "link_os_gnu", LINK_OS_GNU_SPEC }, \ { "link_os_netbsd", LINK_OS_NETBSD_SPEC }, \ { "link_os_openbsd", LINK_OS_OPENBSD_SPEC }, \ - { "link_os_windiss", LINK_OS_WINDISS_SPEC }, \ { "link_os_default", LINK_OS_DEFAULT_SPEC }, \ { "cc1_endian_big", CC1_ENDIAN_BIG_SPEC }, \ { "cc1_endian_little", CC1_ENDIAN_LITTLE_SPEC }, \ @@ -1089,7 +1071,6 @@ ncrtn.o%s" { "cpp_os_linux", CPP_OS_LINUX_SPEC }, \ { "cpp_os_netbsd", CPP_OS_NETBSD_SPEC }, \ { "cpp_os_openbsd", CPP_OS_OPENBSD_SPEC }, \ - { "cpp_os_windiss", CPP_OS_WINDISS_SPEC }, 
\ { "cpp_os_default", CPP_OS_DEFAULT_SPEC }, \ { "fbsd_dynamic_linker", FBSD_DYNAMIC_LINKER }, \ SUBSUBTARGET_EXTRA_SPECS Index: gcc-4.3.4-20091019/gcc/config/rs6000/sysv4.opt =================================================================== --- gcc-4.3.4-20091019.orig/gcc/config/rs6000/sysv4.opt 2008-02-19 10:55:53.000000000 +0100 +++ gcc-4.3.4-20091019/gcc/config/rs6000/sysv4.opt 2009-10-19 13:40:37.000000000 +0200 @@ -1,6 +1,6 @@ ; SYSV4 options for PPC port. ; -; Copyright (C) 2005, 2007 Free Software Foundation, Inc. +; Copyright (C) 2005, 2007, 2008 Free Software Foundation, Inc. ; Contributed by Aldy Hernandez <aldy@quesejoda.com>. ; ; This file is part of GCC. @@ -32,7 +32,7 @@ Target RejectNegative Joined Specify bit size of immediate TLS offsets mbit-align -Target Report Mask(NO_BITFIELD_TYPE) +Target Report Var(TARGET_NO_BITFIELD_TYPE) Align to the base type of the bit-field mstrict-align @@ -74,7 +74,7 @@ Target RejectNegative no description yet mprototype -Target Var(TARGET_PROTOTYPE) +Target Var(target_prototype) Assume all variable arg functions are prototyped ;; FIXME: Does nothing. @@ -87,17 +87,18 @@ Target Report Mask(EABI) Use EABI mbit-word -Target Report Mask(NO_BITFIELD_WORD) +Target Report Var(TARGET_NO_BITFIELD_WORD) Allow bit-fields to cross word boundaries mregnames -Target Mask(REGNAMES) +Target Var(TARGET_REGNAMES) Use alternate register names -;; FIXME: Does nothing. +;; This option does nothing and only exists because the compiler +;; driver passes all -m* options through. 
msdata Target -no description yet +Use default method for sdata handling msim Target RejectNegative @@ -119,10 +120,6 @@ memb Target RejectNegative Set the PPC_EMB bit in the ELF flags header -mwindiss -Target RejectNegative -Use the WindISS simulator - mshlib Target RejectNegative no description yet Index: gcc-4.3.4-20091019/gcc/config/rs6000/t-rs6000 =================================================================== --- gcc-4.3.4-20091019.orig/gcc/config/rs6000/t-rs6000 2008-02-19 10:55:53.000000000 +0100 +++ gcc-4.3.4-20091019/gcc/config/rs6000/t-rs6000 2009-10-19 13:40:37.000000000 +0200 @@ -2,6 +2,8 @@ gt-rs6000.h: s-gtype ; @true +TM_H += $(srcdir)/config/rs6000/rs6000-builtin.def + rs6000.o: $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \ $(RTL_H) $(REGS_H) hard-reg-set.h \ real.h insn-config.h conditions.h insn-attr.h flags.h $(RECOG_H) \ @@ -18,3 +20,33 @@ rs6000-c.o: $(srcdir)/config/rs6000/rs60 # The rs6000 backend doesn't cause warnings in these files. insn-conditions.o-warn = + +MD_INCLUDES = $(srcdir)/config/rs6000/rios1.md \ + $(srcdir)/config/rs6000/rios2.md \ + $(srcdir)/config/rs6000/rs64.md \ + $(srcdir)/config/rs6000/mpc.md \ + $(srcdir)/config/rs6000/40x.md \ + $(srcdir)/config/rs6000/440.md \ + $(srcdir)/config/rs6000/603.md \ + $(srcdir)/config/rs6000/6xx.md \ + $(srcdir)/config/rs6000/7xx.md \ + $(srcdir)/config/rs6000/7450.md \ + $(srcdir)/config/rs6000/8540.md \ + $(srcdir)/config/rs6000/e300c2c3.md \ + $(srcdir)/config/rs6000/e500mc.md \ + $(srcdir)/config/rs6000/power4.md \ + $(srcdir)/config/rs6000/power5.md \ + $(srcdir)/config/rs6000/power6.md \ + $(srcdir)/config/rs6000/power7.md \ + $(srcdir)/config/rs6000/cell.md \ + $(srcdir)/config/rs6000/xfpu.md \ + $(srcdir)/config/rs6000/predicates.md \ + $(srcdir)/config/rs6000/constraints.md \ + $(srcdir)/config/rs6000/darwin.md \ + $(srcdir)/config/rs6000/sync.md \ + $(srcdir)/config/rs6000/vector.md \ + $(srcdir)/config/rs6000/vsx.md \ + $(srcdir)/config/rs6000/altivec.md \ + 
$(srcdir)/config/rs6000/spe.md \ + $(srcdir)/config/rs6000/dfp.md \ + $(srcdir)/config/rs6000/paired.md Index: gcc-4.3.4-20091019/gcc/config/rs6000/vector.md =================================================================== --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ gcc-4.3.4-20091019/gcc/config/rs6000/vector.md 2009-10-19 13:40:37.000000000 +0200 @@ -0,0 +1,946 @@ +;; Expander definitions for vector support between altivec & vsx. No +;; instructions are in this file, this file provides the generic vector +;; expander, and the actual vector instructions will be in altivec.md and +;; vsx.md + +;; Copyright (C) 2009 +;; Free Software Foundation, Inc. +;; Contributed by Michael Meissner <meissner@linux.vnet.ibm.com> + +;; This file is part of GCC. + +;; GCC is free software; you can redistribute it and/or modify it +;; under the terms of the GNU General Public License as published +;; by the Free Software Foundation; either version 3, or (at your +;; option) any later version. + +;; GCC is distributed in the hope that it will be useful, but WITHOUT +;; ANY WARRANTY; without even the implied warranty of MERCHANTABILITY +;; or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public +;; License for more details. + +;; You should have received a copy of the GNU General Public License +;; along with GCC; see the file COPYING3. If not see +;; <http://www.gnu.org/licenses/>. + + +;; Vector int modes +(define_mode_iterator VEC_I [V16QI V8HI V4SI]) + +;; Vector float modes +(define_mode_iterator VEC_F [V4SF V2DF]) + +;; Vector arithmetic modes +(define_mode_iterator VEC_A [V16QI V8HI V4SI V4SF V2DF]) + +;; Vector modes that need alignment via permutes +(define_mode_iterator VEC_K [V16QI V8HI V4SI V4SF]) + +;; Vector logical modes +(define_mode_iterator VEC_L [V16QI V8HI V4SI V2DI V4SF V2DF TI]) + +;; Vector modes for moves. Don't do TImode here. 
+(define_mode_iterator VEC_M [V16QI V8HI V4SI V2DI V4SF V2DF]) + +;; Vector modes for types that don't need a realignment under VSX +(define_mode_iterator VEC_N [V4SI V4SF V2DI V2DF]) + +;; Vector comparison modes +(define_mode_iterator VEC_C [V16QI V8HI V4SI V4SF V2DF]) + +;; Vector init/extract modes +(define_mode_iterator VEC_E [V16QI V8HI V4SI V2DI V4SF V2DF]) + +;; Vector reload iterator +(define_mode_iterator VEC_R [V16QI V8HI V4SI V2DI V4SF V2DF DF TI]) + +;; Base type from vector mode +(define_mode_attr VEC_base [(V16QI "QI") + (V8HI "HI") + (V4SI "SI") + (V2DI "DI") + (V4SF "SF") + (V2DF "DF") + (TI "TI")]) + +;; Same size integer type for floating point data +(define_mode_attr VEC_int [(V4SF "v4si") + (V2DF "v2di")]) + +(define_mode_attr VEC_INT [(V4SF "V4SI") + (V2DF "V2DI")]) + +;; constants for unspec +(define_constants + [(UNSPEC_PREDICATE 400)]) + + +;; Vector move instructions. +(define_expand "mov<mode>" + [(set (match_operand:VEC_M 0 "nonimmediate_operand" "") + (match_operand:VEC_M 1 "any_operand" ""))] + "VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)" +{ + if (can_create_pseudo_p ()) + { + if (CONSTANT_P (operands[1]) + && !easy_vector_constant (operands[1], <MODE>mode)) + operands[1] = force_const_mem (<MODE>mode, operands[1]); + + else if (!vlogical_operand (operands[0], <MODE>mode) + && !vlogical_operand (operands[1], <MODE>mode)) + operands[1] = force_reg (<MODE>mode, operands[1]); + } +}) + +;; Generic vector floating point load/store instructions. These will match +;; insns defined in vsx.md or altivec.md depending on the switches. 
+(define_expand "vector_load_<mode>" + [(set (match_operand:VEC_M 0 "vfloat_operand" "") + (match_operand:VEC_M 1 "memory_operand" ""))] + "VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)" + "") + +(define_expand "vector_store_<mode>" + [(set (match_operand:VEC_M 0 "memory_operand" "") + (match_operand:VEC_M 1 "vfloat_operand" ""))] + "VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)" + "") + +;; Splits if a GPR register was chosen for the move +(define_split + [(set (match_operand:VEC_L 0 "nonimmediate_operand" "") + (match_operand:VEC_L 1 "input_operand" ""))] + "VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode) + && reload_completed + && gpr_or_gpr_p (operands[0], operands[1])" + [(pc)] +{ + rs6000_split_multireg_move (operands[0], operands[1]); + DONE; +}) + + +;; Reload patterns for vector operations. We may need an additional base +;; register to convert the reg+offset addressing to reg+reg for vector +;; registers and reg+reg or (reg+reg)&(-16) addressing to just an index +;; register for gpr registers. +(define_expand "reload_<VEC_R:mode>_<P:mptrsize>_store" + [(parallel [(match_operand:VEC_R 0 "memory_operand" "m") + (match_operand:VEC_R 1 "gpc_reg_operand" "r") + (match_operand:P 2 "register_operand" "=&b")])] + "<P:tptrsize>" +{ + rs6000_secondary_reload_inner (operands[1], operands[0], operands[2], true); + DONE; +}) + +(define_expand "reload_<VEC_R:mode>_<P:mptrsize>_load" + [(parallel [(match_operand:VEC_R 0 "gpc_reg_operand" "=&r") + (match_operand:VEC_R 1 "memory_operand" "m") + (match_operand:P 2 "register_operand" "=&b")])] + "<P:tptrsize>" +{ + rs6000_secondary_reload_inner (operands[0], operands[1], operands[2], false); + DONE; +}) + +;; Reload sometimes tries to move the address to a GPR, and can generate +;; invalid RTL for addresses involving AND -16. Allow addresses involving +;; reg+reg, reg+small constant, or just reg, all wrapped in an AND -16. 
+ +(define_insn_and_split "*vec_reload_and_plus_<mptrsize>" + [(set (match_operand:P 0 "gpc_reg_operand" "=b") + (and:P (plus:P (match_operand:P 1 "gpc_reg_operand" "r") + (match_operand:P 2 "reg_or_cint_operand" "rI")) + (const_int -16)))] + "(TARGET_ALTIVEC || TARGET_VSX) && (reload_in_progress || reload_completed)" + "#" + "&& reload_completed" + [(set (match_dup 0) + (plus:P (match_dup 1) + (match_dup 2))) + (parallel [(set (match_dup 0) + (and:P (match_dup 0) + (const_int -16))) + (clobber:CC (scratch:CC))])]) + +;; The normal ANDSI3/ANDDI3 won't match if reload decides to move an AND -16 +;; address to a register because there is no clobber of a (scratch), so we add +;; it here. +(define_insn_and_split "*vec_reload_and_reg_<mptrsize>" + [(set (match_operand:P 0 "gpc_reg_operand" "=b") + (and:P (match_operand:P 1 "gpc_reg_operand" "r") + (const_int -16)))] + "(TARGET_ALTIVEC || TARGET_VSX) && (reload_in_progress || reload_completed)" + "#" + "&& reload_completed" + [(parallel [(set (match_dup 0) + (and:P (match_dup 1) + (const_int -16))) + (clobber:CC (scratch:CC))])]) + +;; Generic floating point vector arithmetic support +(define_expand "add<mode>3" + [(set (match_operand:VEC_F 0 "vfloat_operand" "") + (plus:VEC_F (match_operand:VEC_F 1 "vfloat_operand" "") + (match_operand:VEC_F 2 "vfloat_operand" "")))] + "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)" + "") + +(define_expand "sub<mode>3" + [(set (match_operand:VEC_F 0 "vfloat_operand" "") + (minus:VEC_F (match_operand:VEC_F 1 "vfloat_operand" "") + (match_operand:VEC_F 2 "vfloat_operand" "")))] + "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)" + "") + +(define_expand "mul<mode>3" + [(set (match_operand:VEC_F 0 "vfloat_operand" "") + (mult:VEC_F (match_operand:VEC_F 1 "vfloat_operand" "") + (match_operand:VEC_F 2 "vfloat_operand" "")))] + "(VECTOR_UNIT_VSX_P (<MODE>mode) + || (VECTOR_UNIT_ALTIVEC_P (<MODE>mode) && TARGET_FUSED_MADD))" + " +{ + if (<MODE>mode == V4SFmode && VECTOR_UNIT_ALTIVEC_P (<MODE>mode)) + { + 
emit_insn (gen_altivec_mulv4sf3 (operands[0], operands[1], operands[2])); + DONE; + } +}") + +(define_expand "div<mode>3" + [(set (match_operand:VEC_F 0 "vfloat_operand" "") + (div:VEC_F (match_operand:VEC_F 1 "vfloat_operand" "") + (match_operand:VEC_F 2 "vfloat_operand" "")))] + "VECTOR_UNIT_VSX_P (<MODE>mode)" + "") + +(define_expand "neg<mode>2" + [(set (match_operand:VEC_F 0 "vfloat_operand" "") + (neg:VEC_F (match_operand:VEC_F 1 "vfloat_operand" "")))] + "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)" + " +{ + if (<MODE>mode == V4SFmode && VECTOR_UNIT_ALTIVEC_P (<MODE>mode)) + { + emit_insn (gen_altivec_negv4sf2 (operands[0], operands[1])); + DONE; + } +}") + +(define_expand "abs<mode>2" + [(set (match_operand:VEC_F 0 "vfloat_operand" "") + (abs:VEC_F (match_operand:VEC_F 1 "vfloat_operand" "")))] + "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)" + " +{ + if (<MODE>mode == V4SFmode && VECTOR_UNIT_ALTIVEC_P (<MODE>mode)) + { + emit_insn (gen_altivec_absv4sf2 (operands[0], operands[1])); + DONE; + } +}") + +(define_expand "smin<mode>3" + [(set (match_operand:VEC_F 0 "register_operand" "") + (smin:VEC_F (match_operand:VEC_F 1 "register_operand" "") + (match_operand:VEC_F 2 "register_operand" "")))] + "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)" + "") + +(define_expand "smax<mode>3" + [(set (match_operand:VEC_F 0 "register_operand" "") + (smax:VEC_F (match_operand:VEC_F 1 "register_operand" "") + (match_operand:VEC_F 2 "register_operand" "")))] + "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)" + "") + + +(define_expand "sqrt<mode>2" + [(set (match_operand:VEC_F 0 "vfloat_operand" "") + (sqrt:VEC_F (match_operand:VEC_F 1 "vfloat_operand" "")))] + "VECTOR_UNIT_VSX_P (<MODE>mode)" + "") + +(define_expand "ftrunc<mode>2" + [(set (match_operand:VEC_F 0 "vfloat_operand" "") + (fix:VEC_F (match_operand:VEC_F 1 "vfloat_operand" "")))] + "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)" + "") + +(define_expand "vector_ceil<mode>2" + [(set (match_operand:VEC_F 0 "vfloat_operand" "") + 
(unspec:VEC_F [(match_operand:VEC_F 1 "vfloat_operand" "")] + UNSPEC_FRIP))] + "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)" + "") + +(define_expand "vector_floor<mode>2" + [(set (match_operand:VEC_F 0 "vfloat_operand" "") + (unspec:VEC_F [(match_operand:VEC_F 1 "vfloat_operand" "")] + UNSPEC_FRIM))] + "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)" + "") + +(define_expand "vector_btrunc<mode>2" + [(set (match_operand:VEC_F 0 "vfloat_operand" "") + (fix:VEC_F (match_operand:VEC_F 1 "vfloat_operand" "")))] + "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)" + "") + +(define_expand "vector_copysign<mode>3" + [(set (match_operand:VEC_F 0 "vfloat_operand" "") + (if_then_else:VEC_F + (ge:VEC_F (match_operand:VEC_F 2 "vfloat_operand" "") + (match_dup 3)) + (abs:VEC_F (match_operand:VEC_F 1 "vfloat_operand" "")) + (neg:VEC_F (abs:VEC_F (match_dup 1)))))] + "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)" + " +{ + if (<MODE>mode == V4SFmode && VECTOR_UNIT_ALTIVEC_P (<MODE>mode)) + { + emit_insn (gen_altivec_copysign_v4sf3 (operands[0], operands[1], + operands[2])); + DONE; + } + + operands[3] = CONST0_RTX (<MODE>mode); +}") + + +;; Vector comparisons +(define_expand "vcond<mode>" + [(set (match_operand:VEC_F 0 "vfloat_operand" "") + (if_then_else:VEC_F + (match_operator 3 "comparison_operator" + [(match_operand:VEC_F 4 "vfloat_operand" "") + (match_operand:VEC_F 5 "vfloat_operand" "")]) + (match_operand:VEC_F 1 "vfloat_operand" "") + (match_operand:VEC_F 2 "vfloat_operand" "")))] + "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)" + " +{ + if (rs6000_emit_vector_cond_expr (operands[0], operands[1], operands[2], + operands[3], operands[4], operands[5])) + DONE; + else + FAIL; +}") + +(define_expand "vcond<mode>" + [(set (match_operand:VEC_I 0 "vint_operand" "") + (if_then_else:VEC_I + (match_operator 3 "comparison_operator" + [(match_operand:VEC_I 4 "vint_operand" "") + (match_operand:VEC_I 5 "vint_operand" "")]) + (match_operand:VEC_I 1 "vint_operand" "") + (match_operand:VEC_I 2 "vint_operand" 
"")))] + "VECTOR_UNIT_ALTIVEC_P (<MODE>mode)" + " +{ + if (rs6000_emit_vector_cond_expr (operands[0], operands[1], operands[2], + operands[3], operands[4], operands[5])) + DONE; + else + FAIL; +}") + +(define_expand "vcondu<mode>" + [(set (match_operand:VEC_I 0 "vint_operand" "") + (if_then_else:VEC_I + (match_operator 3 "comparison_operator" + [(match_operand:VEC_I 4 "vint_operand" "") + (match_operand:VEC_I 5 "vint_operand" "")]) + (match_operand:VEC_I 1 "vint_operand" "") + (match_operand:VEC_I 2 "vint_operand" "")))] + "VECTOR_UNIT_ALTIVEC_P (<MODE>mode)" + " +{ + if (rs6000_emit_vector_cond_expr (operands[0], operands[1], operands[2], + operands[3], operands[4], operands[5])) + DONE; + else + FAIL; +}") + +(define_expand "vector_eq<mode>" + [(set (match_operand:VEC_C 0 "vlogical_operand" "") + (eq:VEC_C (match_operand:VEC_C 1 "vlogical_operand" "") + (match_operand:VEC_C 2 "vlogical_operand" "")))] + "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)" + "") + +(define_expand "vector_gt<mode>" + [(set (match_operand:VEC_C 0 "vlogical_operand" "") + (gt:VEC_C (match_operand:VEC_C 1 "vlogical_operand" "") + (match_operand:VEC_C 2 "vlogical_operand" "")))] + "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)" + "") + +(define_expand "vector_ge<mode>" + [(set (match_operand:VEC_C 0 "vlogical_operand" "") + (ge:VEC_C (match_operand:VEC_C 1 "vlogical_operand" "") + (match_operand:VEC_C 2 "vlogical_operand" "")))] + "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)" + "") + +(define_expand "vector_gtu<mode>" + [(set (match_operand:VEC_I 0 "vint_operand" "") + (gtu:VEC_I (match_operand:VEC_I 1 "vint_operand" "") + (match_operand:VEC_I 2 "vint_operand" "")))] + "VECTOR_UNIT_ALTIVEC_P (<MODE>mode)" + "") + +(define_expand "vector_geu<mode>" + [(set (match_operand:VEC_I 0 "vint_operand" "") + (geu:VEC_I (match_operand:VEC_I 1 "vint_operand" "") + (match_operand:VEC_I 2 "vint_operand" "")))] + "VECTOR_UNIT_ALTIVEC_P (<MODE>mode)" + "") + +;; Note the arguments for __builtin_altivec_vsel are op2, 
op1, mask +;; which is in the reverse order that we want +(define_expand "vector_select_<mode>" + [(set (match_operand:VEC_L 0 "vlogical_operand" "") + (if_then_else:VEC_L + (ne:CC (match_operand:VEC_L 3 "vlogical_operand" "") + (const_int 0)) + (match_operand:VEC_L 2 "vlogical_operand" "") + (match_operand:VEC_L 1 "vlogical_operand" "")))] + "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)" + "") + +(define_expand "vector_select_<mode>_uns" + [(set (match_operand:VEC_L 0 "vlogical_operand" "") + (if_then_else:VEC_L + (ne:CCUNS (match_operand:VEC_L 3 "vlogical_operand" "") + (const_int 0)) + (match_operand:VEC_L 2 "vlogical_operand" "") + (match_operand:VEC_L 1 "vlogical_operand" "")))] + "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)" + "") + +;; Expansions that compare vectors producing a vector result and a predicate, +;; setting CR6 to indicate a combined status +(define_expand "vector_eq_<mode>_p" + [(parallel + [(set (reg:CC 74) + (unspec:CC [(eq:CC (match_operand:VEC_A 1 "vlogical_operand" "") + (match_operand:VEC_A 2 "vlogical_operand" ""))] + UNSPEC_PREDICATE)) + (set (match_operand:VEC_A 0 "vlogical_operand" "") + (eq:VEC_A (match_dup 1) + (match_dup 2)))])] + "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)" + "") + +(define_expand "vector_gt_<mode>_p" + [(parallel + [(set (reg:CC 74) + (unspec:CC [(gt:CC (match_operand:VEC_A 1 "vlogical_operand" "") + (match_operand:VEC_A 2 "vlogical_operand" ""))] + UNSPEC_PREDICATE)) + (set (match_operand:VEC_A 0 "vlogical_operand" "") + (gt:VEC_A (match_dup 1) + (match_dup 2)))])] + "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)" + "") + +(define_expand "vector_ge_<mode>_p" + [(parallel + [(set (reg:CC 74) + (unspec:CC [(ge:CC (match_operand:VEC_F 1 "vfloat_operand" "") + (match_operand:VEC_F 2 "vfloat_operand" ""))] + UNSPEC_PREDICATE)) + (set (match_operand:VEC_F 0 "vfloat_operand" "") + (ge:VEC_F (match_dup 1) + (match_dup 2)))])] + "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)" + "") + +(define_expand "vector_gtu_<mode>_p" + [(parallel 
+ [(set (reg:CC 74) + (unspec:CC [(gtu:CC (match_operand:VEC_I 1 "vint_operand" "") + (match_operand:VEC_I 2 "vint_operand" ""))] + UNSPEC_PREDICATE)) + (set (match_operand:VEC_I 0 "vlogical_operand" "") + (gtu:VEC_I (match_dup 1) + (match_dup 2)))])] + "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)" + "") + +;; AltiVec/VSX predicates. + +(define_expand "cr6_test_for_zero" + [(set (match_operand:SI 0 "register_operand" "=r") + (eq:SI (reg:CC 74) + (const_int 0)))] + "TARGET_ALTIVEC || TARGET_VSX" + "") + +(define_expand "cr6_test_for_zero_reverse" + [(set (match_operand:SI 0 "register_operand" "=r") + (eq:SI (reg:CC 74) + (const_int 0))) + (set (match_dup 0) (minus:SI (const_int 1) (match_dup 0)))] + "TARGET_ALTIVEC || TARGET_VSX" + "") + +(define_expand "cr6_test_for_lt" + [(set (match_operand:SI 0 "register_operand" "=r") + (lt:SI (reg:CC 74) + (const_int 0)))] + "TARGET_ALTIVEC || TARGET_VSX" + "") + +(define_expand "cr6_test_for_lt_reverse" + [(set (match_operand:SI 0 "register_operand" "=r") + (lt:SI (reg:CC 74) + (const_int 0))) + (set (match_dup 0) (minus:SI (const_int 1) (match_dup 0)))] + "TARGET_ALTIVEC || TARGET_VSX" + "") + + +;; Vector logical instructions +(define_expand "xor<mode>3" + [(set (match_operand:VEC_L 0 "vlogical_operand" "") + (xor:VEC_L (match_operand:VEC_L 1 "vlogical_operand" "") + (match_operand:VEC_L 2 "vlogical_operand" "")))] + "VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)" + "") + +(define_expand "ior<mode>3" + [(set (match_operand:VEC_L 0 "vlogical_operand" "") + (ior:VEC_L (match_operand:VEC_L 1 "vlogical_operand" "") + (match_operand:VEC_L 2 "vlogical_operand" "")))] + "VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)" + "") + +(define_expand "and<mode>3" + [(set (match_operand:VEC_L 0 "vlogical_operand" "") + (and:VEC_L (match_operand:VEC_L 1 "vlogical_operand" "") + (match_operand:VEC_L 2 "vlogical_operand" "")))] + "VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)" + "") + +(define_expand "one_cmpl<mode>2" + [(set (match_operand:VEC_L 0 
"vlogical_operand" "") + (not:VEC_L (match_operand:VEC_L 1 "vlogical_operand" "")))] + "VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)" + "") + +(define_expand "nor<mode>3" + [(set (match_operand:VEC_L 0 "vlogical_operand" "") + (not:VEC_L (ior:VEC_L (match_operand:VEC_L 1 "vlogical_operand" "") + (match_operand:VEC_L 2 "vlogical_operand" ""))))] + "VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)" + "") + +(define_expand "andc<mode>3" + [(set (match_operand:VEC_L 0 "vlogical_operand" "") + (and:VEC_L (not:VEC_L (match_operand:VEC_L 2 "vlogical_operand" "")) + (match_operand:VEC_L 1 "vlogical_operand" "")))] + "VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)" + "") + +;; Same size conversions +(define_expand "float<VEC_int><mode>2" + [(set (match_operand:VEC_F 0 "vfloat_operand" "") + (float:VEC_F (match_operand:<VEC_INT> 1 "vint_operand" "")))] + "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)" + " +{ + if (<MODE>mode == V4SFmode && VECTOR_UNIT_ALTIVEC_P (<MODE>mode)) + { + emit_insn (gen_altivec_vcfsx (operands[0], operands[1], const0_rtx)); + DONE; + } +}") + +(define_expand "unsigned_float<VEC_int><mode>2" + [(set (match_operand:VEC_F 0 "vfloat_operand" "") + (unsigned_float:VEC_F (match_operand:<VEC_INT> 1 "vint_operand" "")))] + "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)" + " +{ + if (<MODE>mode == V4SFmode && VECTOR_UNIT_ALTIVEC_P (<MODE>mode)) + { + emit_insn (gen_altivec_vcfux (operands[0], operands[1], const0_rtx)); + DONE; + } +}") + +(define_expand "fix_trunc<mode><VEC_int>2" + [(set (match_operand:<VEC_INT> 0 "vint_operand" "") + (fix:<VEC_INT> (match_operand:VEC_F 1 "vfloat_operand" "")))] + "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)" + " +{ + if (<MODE>mode == V4SFmode && VECTOR_UNIT_ALTIVEC_P (<MODE>mode)) + { + emit_insn (gen_altivec_vctsxs (operands[0], operands[1], const0_rtx)); + DONE; + } +}") + +(define_expand "fixuns_trunc<mode><VEC_int>2" + [(set (match_operand:<VEC_INT> 0 "vint_operand" "") + (unsigned_fix:<VEC_INT> (match_operand:VEC_F 1 "vfloat_operand" "")))] + 
"VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)" + " +{ + if (<MODE>mode == V4SFmode && VECTOR_UNIT_ALTIVEC_P (<MODE>mode)) + { + emit_insn (gen_altivec_vctuxs (operands[0], operands[1], const0_rtx)); + DONE; + } +}") + + +;; Vector initialization, set, extract +(define_expand "vec_init<mode>" + [(match_operand:VEC_E 0 "vlogical_operand" "") + (match_operand:VEC_E 1 "" "")] + "VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)" +{ + rs6000_expand_vector_init (operands[0], operands[1]); + DONE; +}) + +(define_expand "vec_set<mode>" + [(match_operand:VEC_E 0 "vlogical_operand" "") + (match_operand:<VEC_base> 1 "register_operand" "") + (match_operand 2 "const_int_operand" "")] + "VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)" +{ + rs6000_expand_vector_set (operands[0], operands[1], INTVAL (operands[2])); + DONE; +}) + +(define_expand "vec_extract<mode>" + [(match_operand:<VEC_base> 0 "register_operand" "") + (match_operand:VEC_E 1 "vlogical_operand" "") + (match_operand 2 "const_int_operand" "")] + "VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)" +{ + rs6000_expand_vector_extract (operands[0], operands[1], + INTVAL (operands[2])); + DONE; +}) + +;; Interleave patterns +(define_expand "vec_interleave_highv4sf" + [(set (match_operand:V4SF 0 "vfloat_operand" "") + (vec_merge:V4SF + (vec_select:V4SF (match_operand:V4SF 1 "vfloat_operand" "") + (parallel [(const_int 0) + (const_int 2) + (const_int 1) + (const_int 3)])) + (vec_select:V4SF (match_operand:V4SF 2 "vfloat_operand" "") + (parallel [(const_int 2) + (const_int 0) + (const_int 3) + (const_int 1)])) + (const_int 5)))] + "VECTOR_UNIT_ALTIVEC_OR_VSX_P (V4SFmode)" + "") + +(define_expand "vec_interleave_lowv4sf" + [(set (match_operand:V4SF 0 "vfloat_operand" "") + (vec_merge:V4SF + (vec_select:V4SF (match_operand:V4SF 1 "vfloat_operand" "") + (parallel [(const_int 2) + (const_int 0) + (const_int 3) + (const_int 1)])) + (vec_select:V4SF (match_operand:V4SF 2 "vfloat_operand" "") + (parallel [(const_int 0) + (const_int 2) + (const_int 1) + 
(const_int 3)])) + (const_int 5)))] + "VECTOR_UNIT_ALTIVEC_OR_VSX_P (V4SFmode)" + "") + +(define_expand "vec_interleave_highv2df" + [(set (match_operand:V2DF 0 "vfloat_operand" "") + (vec_concat:V2DF + (vec_select:DF (match_operand:V2DF 1 "vfloat_operand" "") + (parallel [(const_int 0)])) + (vec_select:DF (match_operand:V2DF 2 "vfloat_operand" "") + (parallel [(const_int 0)]))))] + "VECTOR_UNIT_VSX_P (V2DFmode)" + "") + +(define_expand "vec_interleave_lowv2df" + [(set (match_operand:V2DF 0 "vfloat_operand" "") + (vec_concat:V2DF + (vec_select:DF (match_operand:V2DF 1 "vfloat_operand" "") + (parallel [(const_int 1)])) + (vec_select:DF (match_operand:V2DF 2 "vfloat_operand" "") + (parallel [(const_int 1)]))))] + "VECTOR_UNIT_VSX_P (V2DFmode)" + "") + + +;; Convert double word types to single word types +(define_expand "vec_pack_trunc_v2df" + [(match_operand:V4SF 0 "vfloat_operand" "") + (match_operand:V2DF 1 "vfloat_operand" "") + (match_operand:V2DF 2 "vfloat_operand" "")] + "VECTOR_UNIT_VSX_P (V2DFmode) && TARGET_ALTIVEC" +{ + rtx r1 = gen_reg_rtx (V4SFmode); + rtx r2 = gen_reg_rtx (V4SFmode); + + emit_insn (gen_vsx_xvcvdpsp (r1, operands[1])); + emit_insn (gen_vsx_xvcvdpsp (r2, operands[2])); + emit_insn (gen_vec_extract_evenv4sf (operands[0], r1, r2)); + DONE; +}) + +(define_expand "vec_pack_sfix_trunc_v2df" + [(match_operand:V4SI 0 "vint_operand" "") + (match_operand:V2DF 1 "vfloat_operand" "") + (match_operand:V2DF 2 "vfloat_operand" "")] + "VECTOR_UNIT_VSX_P (V2DFmode) && TARGET_ALTIVEC" +{ + rtx r1 = gen_reg_rtx (V4SImode); + rtx r2 = gen_reg_rtx (V4SImode); + + emit_insn (gen_vsx_xvcvdpsxws (r1, operands[1])); + emit_insn (gen_vsx_xvcvdpsxws (r2, operands[2])); + emit_insn (gen_vec_extract_evenv4si (operands[0], r1, r2)); + DONE; +}) + +(define_expand "vec_pack_ufix_trunc_v2df" + [(match_operand:V4SI 0 "vint_operand" "") + (match_operand:V2DF 1 "vfloat_operand" "") + (match_operand:V2DF 2 "vfloat_operand" "")] + "VECTOR_UNIT_VSX_P (V2DFmode) && 
TARGET_ALTIVEC" +{ + rtx r1 = gen_reg_rtx (V4SImode); + rtx r2 = gen_reg_rtx (V4SImode); + + emit_insn (gen_vsx_xvcvdpuxws (r1, operands[1])); + emit_insn (gen_vsx_xvcvdpuxws (r2, operands[2])); + emit_insn (gen_vec_extract_evenv4si (operands[0], r1, r2)); + DONE; +}) + +;; Convert single word types to double word +(define_expand "vec_unpacks_hi_v4sf" + [(match_operand:V2DF 0 "vfloat_operand" "") + (match_operand:V4SF 1 "vfloat_operand" "")] + "VECTOR_UNIT_VSX_P (V2DFmode) && VECTOR_UNIT_ALTIVEC_OR_VSX_P (V4SFmode)" +{ + rtx reg = gen_reg_rtx (V4SFmode); + + emit_insn (gen_vec_interleave_highv4sf (reg, operands[1], operands[1])); + emit_insn (gen_vsx_xvcvspdp (operands[0], reg)); + DONE; +}) + +(define_expand "vec_unpacks_lo_v4sf" + [(match_operand:V2DF 0 "vfloat_operand" "") + (match_operand:V4SF 1 "vfloat_operand" "")] + "VECTOR_UNIT_VSX_P (V2DFmode) && VECTOR_UNIT_ALTIVEC_OR_VSX_P (V4SFmode)" +{ + rtx reg = gen_reg_rtx (V4SFmode); + + emit_insn (gen_vec_interleave_lowv4sf (reg, operands[1], operands[1])); + emit_insn (gen_vsx_xvcvspdp (operands[0], reg)); + DONE; +}) + +(define_expand "vec_unpacks_float_hi_v4si" + [(match_operand:V2DF 0 "vfloat_operand" "") + (match_operand:V4SI 1 "vint_operand" "")] + "VECTOR_UNIT_VSX_P (V2DFmode) && VECTOR_UNIT_ALTIVEC_OR_VSX_P (V4SImode)" +{ + rtx reg = gen_reg_rtx (V4SImode); + + emit_insn (gen_vec_interleave_highv4si (reg, operands[1], operands[1])); + emit_insn (gen_vsx_xvcvsxwdp (operands[0], reg)); + DONE; +}) + +(define_expand "vec_unpacks_float_lo_v4si" + [(match_operand:V2DF 0 "vfloat_operand" "") + (match_operand:V4SI 1 "vint_operand" "")] + "VECTOR_UNIT_VSX_P (V2DFmode) && VECTOR_UNIT_ALTIVEC_OR_VSX_P (V4SImode)" +{ + rtx reg = gen_reg_rtx (V4SImode); + + emit_insn (gen_vec_interleave_lowv4si (reg, operands[1], operands[1])); + emit_insn (gen_vsx_xvcvsxwdp (operands[0], reg)); + DONE; +}) + +(define_expand "vec_unpacku_float_hi_v4si" + [(match_operand:V2DF 0 "vfloat_operand" "") + (match_operand:V4SI 1 
"vint_operand" "")] + "VECTOR_UNIT_VSX_P (V2DFmode) && VECTOR_UNIT_ALTIVEC_OR_VSX_P (V4SImode)" +{ + rtx reg = gen_reg_rtx (V4SImode); + + emit_insn (gen_vec_interleave_highv4si (reg, operands[1], operands[1])); + emit_insn (gen_vsx_xvcvuxwdp (operands[0], reg)); + DONE; +}) + +(define_expand "vec_unpacku_float_lo_v4si" + [(match_operand:V2DF 0 "vfloat_operand" "") + (match_operand:V4SI 1 "vint_operand" "")] + "VECTOR_UNIT_VSX_P (V2DFmode) && VECTOR_UNIT_ALTIVEC_OR_VSX_P (V4SImode)" +{ + rtx reg = gen_reg_rtx (V4SImode); + + emit_insn (gen_vec_interleave_lowv4si (reg, operands[1], operands[1])); + emit_insn (gen_vsx_xvcvuxwdp (operands[0], reg)); + DONE; +}) + + +;; Align vector loads with a permute. +(define_expand "vec_realign_load_<mode>" + [(match_operand:VEC_K 0 "vlogical_operand" "") + (match_operand:VEC_K 1 "vlogical_operand" "") + (match_operand:VEC_K 2 "vlogical_operand" "") + (match_operand:V16QI 3 "vlogical_operand" "")] + "VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)" +{ + emit_insn (gen_altivec_vperm_<mode> (operands[0], operands[1], operands[2], + operands[3])); + DONE; +}) + +;; Under VSX, vectors of 4/8 byte alignments do not need to be aligned +;; since the load already handles it. +(define_expand "movmisalign<mode>" + [(set (match_operand:VEC_N 0 "vfloat_operand" "") + (match_operand:VEC_N 1 "vfloat_operand" ""))] + "VECTOR_MEM_VSX_P (<MODE>mode) && TARGET_ALLOW_MOVMISALIGN" + "") + + +;; Vector shift left in bits. Currently supported ony for shift +;; amounts that can be expressed as byte shifts (divisible by 8). +;; General shift amounts can be supported using vslo + vsl. We're +;; not expecting to see these yet (the vectorizer currently +;; generates only shifts divisible by byte_size). 
+(define_expand "vec_shl_<mode>" + [(match_operand:VEC_L 0 "vlogical_operand" "") + (match_operand:VEC_L 1 "vlogical_operand" "") + (match_operand:QI 2 "reg_or_short_operand" "")] + "TARGET_ALTIVEC" + " +{ + rtx bitshift = operands[2]; + rtx shift; + rtx insn; + HOST_WIDE_INT bitshift_val; + HOST_WIDE_INT byteshift_val; + + if (! CONSTANT_P (bitshift)) + FAIL; + bitshift_val = INTVAL (bitshift); + if (bitshift_val & 0x7) + FAIL; + byteshift_val = bitshift_val >> 3; + if (TARGET_VSX && (byteshift_val & 0x3) == 0) + { + shift = gen_rtx_CONST_INT (QImode, byteshift_val >> 2); + insn = gen_vsx_xxsldwi_<mode> (operands[0], operands[1], operands[1], + shift); + } + else + { + shift = gen_rtx_CONST_INT (QImode, byteshift_val); + insn = gen_altivec_vsldoi_<mode> (operands[0], operands[1], operands[1], + shift); + } + + emit_insn (insn); + DONE; +}") + +;; Vector shift right in bits. Currently supported only for shift +;; amounts that can be expressed as byte shifts (divisible by 8). +;; General shift amounts can be supported using vsro + vsr. We're +;; not expecting to see these yet (the vectorizer currently +;; generates only shifts divisible by byte_size). +(define_expand "vec_shr_<mode>" + [(match_operand:VEC_L 0 "vlogical_operand" "") + (match_operand:VEC_L 1 "vlogical_operand" "") + (match_operand:QI 2 "reg_or_short_operand" "")] + "TARGET_ALTIVEC" + " +{ + rtx bitshift = operands[2]; + rtx shift; + rtx insn; + HOST_WIDE_INT bitshift_val; + HOST_WIDE_INT byteshift_val; + + if (!
CONSTANT_P (bitshift)) + FAIL; + bitshift_val = INTVAL (bitshift); + if (bitshift_val & 0x7) + FAIL; + byteshift_val = 16 - (bitshift_val >> 3); + if (TARGET_VSX && (byteshift_val & 0x3) == 0) + { + shift = gen_rtx_CONST_INT (QImode, byteshift_val >> 2); + insn = gen_vsx_xxsldwi_<mode> (operands[0], operands[1], operands[1], + shift); + } + else + { + shift = gen_rtx_CONST_INT (QImode, byteshift_val); + insn = gen_altivec_vsldoi_<mode> (operands[0], operands[1], operands[1], + shift); + } + + emit_insn (insn); + DONE; +}") + +;; Expanders for rotate each element in a vector +(define_expand "vrotl<mode>3" + [(set (match_operand:VEC_I 0 "vint_operand" "") + (rotate:VEC_I (match_operand:VEC_I 1 "vint_operand" "") + (match_operand:VEC_I 2 "vint_operand" "")))] + "TARGET_ALTIVEC" + "") + +;; Expanders for arithmetic shift left on each vector element +(define_expand "vashl<mode>3" + [(set (match_operand:VEC_I 0 "vint_operand" "") + (ashift:VEC_I (match_operand:VEC_I 1 "vint_operand" "") + (match_operand:VEC_I 2 "vint_operand" "")))] + "TARGET_ALTIVEC" + "") + +;; Expanders for logical shift right on each vector element +(define_expand "vlshr<mode>3" + [(set (match_operand:VEC_I 0 "vint_operand" "") + (lshiftrt:VEC_I (match_operand:VEC_I 1 "vint_operand" "") + (match_operand:VEC_I 2 "vint_operand" "")))] + "TARGET_ALTIVEC" + "") + +;; Expanders for arithmetic shift right on each vector element +(define_expand "vashr<mode>3" + [(set (match_operand:VEC_I 0 "vint_operand" "") + (ashiftrt:VEC_I (match_operand:VEC_I 1 "vint_operand" "") + (match_operand:VEC_I 2 "vint_operand" "")))] + "TARGET_ALTIVEC" + "") Index: gcc-4.3.4-20091019/gcc/config/rs6000/vsx.md =================================================================== --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ gcc-4.3.4-20091019/gcc/config/rs6000/vsx.md 2009-10-19 13:40:37.000000000 +0200 @@ -0,0 +1,1293 @@ +;; VSX patterns. +;; Copyright (C) 2009 +;; Free Software Foundation, Inc. 
+;; Contributed by Michael Meissner <meissner@linux.vnet.ibm.com> + +;; This file is part of GCC. + +;; GCC is free software; you can redistribute it and/or modify it +;; under the terms of the GNU General Public License as published +;; by the Free Software Foundation; either version 3, or (at your +;; option) any later version. + +;; GCC is distributed in the hope that it will be useful, but WITHOUT +;; ANY WARRANTY; without even the implied warranty of MERCHANTABILITY +;; or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public +;; License for more details. + +;; You should have received a copy of the GNU General Public License +;; along with GCC; see the file COPYING3. If not see +;; <http://www.gnu.org/licenses/>. + +;; Iterator for both scalar and vector floating point types supported by VSX +(define_mode_iterator VSX_B [DF V4SF V2DF]) + +;; Iterator for the 2 64-bit vector types +(define_mode_iterator VSX_D [V2DF V2DI]) + +;; Iterator for the 2 32-bit vector types +(define_mode_iterator VSX_W [V4SF V4SI]) + +;; Iterator for vector floating point types supported by VSX +(define_mode_iterator VSX_F [V4SF V2DF]) + +;; Iterator for logical types supported by VSX +(define_mode_iterator VSX_L [V16QI V8HI V4SI V2DI V4SF V2DF TI]) + +;; Iterator for memory move. Handle TImode specially to allow +;; it to use gprs as well as vsx registers. 
+(define_mode_iterator VSX_M [V16QI V8HI V4SI V2DI V4SF V2DF]) + +;; Map into the appropriate load/store name based on the type +(define_mode_attr VSm [(V16QI "vw4") + (V8HI "vw4") + (V4SI "vw4") + (V4SF "vw4") + (V2DF "vd2") + (V2DI "vd2") + (DF "d") + (TI "vw4")]) + +;; Map into the appropriate suffix based on the type +(define_mode_attr VSs [(V16QI "sp") + (V8HI "sp") + (V4SI "sp") + (V4SF "sp") + (V2DF "dp") + (V2DI "dp") + (DF "dp") + (SF "sp") + (TI "sp")]) + +;; Map the register class used +(define_mode_attr VSr [(V16QI "v") + (V8HI "v") + (V4SI "v") + (V4SF "wf") + (V2DI "wd") + (V2DF "wd") + (DF "ws") + (SF "d") + (TI "wd")]) + +;; Map the register class used for float<->int conversions +(define_mode_attr VSr2 [(V2DF "wd") + (V4SF "wf") + (DF "!f#r")]) + +(define_mode_attr VSr3 [(V2DF "wa") + (V4SF "wa") + (DF "!f#r")]) + +;; Map the register class for sp<->dp float conversions, destination +(define_mode_attr VSr4 [(SF "ws") + (DF "f") + (V2DF "wd") + (V4SF "v")]) + +;; Map the register class for sp<->dp float conversions, source +(define_mode_attr VSr5 [(SF "ws") + (DF "f") + (V2DF "v") + (V4SF "wd")]) + +;; Same size integer type for floating point data +(define_mode_attr VSi [(V4SF "v4si") + (V2DF "v2di") + (DF "di")]) + +(define_mode_attr VSI [(V4SF "V4SI") + (V2DF "V2DI") + (DF "DI")]) + +;; Word size for same size conversion +(define_mode_attr VSc [(V4SF "w") + (V2DF "d") + (DF "d")]) + +;; Map into either s or v, depending on whether this is a scalar or vector +;; operation +(define_mode_attr VSv [(V16QI "v") + (V8HI "v") + (V4SI "v") + (V4SF "v") + (V2DI "v") + (V2DF "v") + (TI "v") + (DF "s")]) + +;; Appropriate type for add ops (and other simple FP ops) +(define_mode_attr VStype_simple [(V2DF "vecfloat") + (V4SF "vecfloat") + (DF "fp")]) + +(define_mode_attr VSfptype_simple [(V2DF "fp_addsub_d") + (V4SF "fp_addsub_s") + (DF "fp_addsub_d")]) + +;; Appropriate type for multiply ops +(define_mode_attr VStype_mul [(V2DF "vecfloat") + (V4SF
"vecfloat") + (DF "dmul")]) + +(define_mode_attr VSfptype_mul [(V2DF "fp_mul_d") + (V4SF "fp_mul_s") + (DF "fp_mul_d")]) + +;; Appropriate type for divide ops. For now, just lump the vector divide with +;; the scalar divides +(define_mode_attr VStype_div [(V2DF "ddiv") + (V4SF "sdiv") + (DF "ddiv")]) + +(define_mode_attr VSfptype_div [(V2DF "fp_div_d") + (V4SF "fp_div_s") + (DF "fp_div_d")]) + +;; Appropriate type for sqrt ops. For now, just lump the vector sqrt with +;; the scalar sqrt +(define_mode_attr VStype_sqrt [(V2DF "dsqrt") + (V4SF "sdiv") + (DF "ddiv")]) + +(define_mode_attr VSfptype_sqrt [(V2DF "fp_sqrt_d") + (V4SF "fp_sqrt_s") + (DF "fp_sqrt_d")]) + +;; Iterator and modes for sp<->dp conversions +;; Because scalar SF values are represented internally as double, use the +;; V4SF type to represent this than SF. +(define_mode_iterator VSX_SPDP [DF V4SF V2DF]) + +(define_mode_attr VS_spdp_res [(DF "V4SF") + (V4SF "V2DF") + (V2DF "V4SF")]) + +(define_mode_attr VS_spdp_insn [(DF "xscvdpsp") + (V4SF "xvcvspdp") + (V2DF "xvcvdpsp")]) + +(define_mode_attr VS_spdp_type [(DF "fp") + (V4SF "vecfloat") + (V2DF "vecfloat")]) + +;; Map the scalar mode for a vector type +(define_mode_attr VS_scalar [(V2DF "DF") + (V2DI "DI") + (V4SF "SF") + (V4SI "SI") + (V8HI "HI") + (V16QI "QI")]) + +;; Constants for creating unspecs +(define_constants + [(UNSPEC_VSX_CONCAT 500) + (UNSPEC_VSX_CVDPSXWS 501) + (UNSPEC_VSX_CVDPUXWS 502) + (UNSPEC_VSX_CVSPDP 503) + (UNSPEC_VSX_CVSXWDP 504) + (UNSPEC_VSX_CVUXWDP 505) + (UNSPEC_VSX_CVSXDSP 506) + (UNSPEC_VSX_CVUXDSP 507) + (UNSPEC_VSX_CVSPSXDS 508) + (UNSPEC_VSX_CVSPUXDS 509) + (UNSPEC_VSX_MADD 510) + (UNSPEC_VSX_MSUB 511) + (UNSPEC_VSX_NMADD 512) + (UNSPEC_VSX_NMSUB 513) + (UNSPEC_VSX_RSQRTE 514) + (UNSPEC_VSX_TDIV 515) + (UNSPEC_VSX_TSQRT 516) + (UNSPEC_VSX_XXPERMDI 517) + (UNSPEC_VSX_SET 518) + (UNSPEC_VSX_ROUND_I 519) + (UNSPEC_VSX_ROUND_IC 520) + (UNSPEC_VSX_SLDWI 521)]) + +;; VSX moves +(define_insn "*vsx_mov<mode>" + [(set 
(match_operand:VSX_M 0 "nonimmediate_operand" "=Z,<VSr>,<VSr>,?Z,?wa,?wa,*o,*r,*r,<VSr>,?wa,v,wZ,v") + (match_operand:VSX_M 1 "input_operand" "<VSr>,Z,<VSr>,wa,Z,wa,r,o,r,j,j,W,v,wZ"))] + "VECTOR_MEM_VSX_P (<MODE>mode) + && (register_operand (operands[0], <MODE>mode) + || register_operand (operands[1], <MODE>mode))" +{ + switch (which_alternative) + { + case 0: + case 3: + gcc_assert (MEM_P (operands[0]) + && GET_CODE (XEXP (operands[0], 0)) != PRE_INC + && GET_CODE (XEXP (operands[0], 0)) != PRE_DEC + && GET_CODE (XEXP (operands[0], 0)) != PRE_MODIFY); + return "stx<VSm>x %x1,%y0"; + + case 1: + case 4: + gcc_assert (MEM_P (operands[1]) + && GET_CODE (XEXP (operands[1], 0)) != PRE_INC + && GET_CODE (XEXP (operands[1], 0)) != PRE_DEC + && GET_CODE (XEXP (operands[1], 0)) != PRE_MODIFY); + return "lx<VSm>x %x0,%y1"; + + case 2: + case 5: + return "xxlor %x0,%x1,%x1"; + + case 6: + case 7: + case 8: + return "#"; + + case 9: + case 10: + return "xxlxor %x0,%x0,%x0"; + + case 11: + return output_vec_const_move (operands); + + case 12: + gcc_assert (MEM_P (operands[0]) + && GET_CODE (XEXP (operands[0], 0)) != PRE_INC + && GET_CODE (XEXP (operands[0], 0)) != PRE_DEC + && GET_CODE (XEXP (operands[0], 0)) != PRE_MODIFY); + return "stvx %1,%y0"; + + case 13: + gcc_assert (MEM_P (operands[1]) + && GET_CODE (XEXP (operands[1], 0)) != PRE_INC + && GET_CODE (XEXP (operands[1], 0)) != PRE_DEC + && GET_CODE (XEXP (operands[1], 0)) != PRE_MODIFY); + return "lvx %0,%y1"; + + default: + gcc_unreachable (); + } +} + [(set_attr "type" "vecstore,vecload,vecsimple,vecstore,vecload,vecsimple,*,*,*,vecsimple,vecsimple,*,vecstore,vecload")]) + +;; Unlike other VSX moves, allow the GPRs, since a normal use of TImode is for +;; unions.
However for plain data movement, slightly favor the vector loads +(define_insn "*vsx_movti" + [(set (match_operand:TI 0 "nonimmediate_operand" "=Z,wa,wa,?o,?r,?r,wa,v,wZ,v") + (match_operand:TI 1 "input_operand" "wa,Z,wa,r,o,r,j,W,v,wZ"))] + "VECTOR_MEM_VSX_P (TImode) + && (register_operand (operands[0], TImode) + || register_operand (operands[1], TImode))" +{ + switch (which_alternative) + { + case 0: + return "stxvd2x %x1,%y0"; + + case 1: + return "lxvd2x %x0,%y1"; + + case 2: + return "xxlor %x0,%x1,%x1"; + + case 3: + case 4: + case 5: + return "#"; + + case 6: + return "xxlxor %x0,%x0,%x0"; + + case 7: + return output_vec_const_move (operands); + + case 8: + return "stvx %1,%y0"; + + case 9: + return "lvx %0,%y1"; + + default: + gcc_unreachable (); + } +} + [(set_attr "type" "vecstore,vecload,vecsimple,*,*,*,vecsimple,*,vecstore,vecload")]) + + +;; VSX scalar and vector floating point arithmetic instructions +(define_insn "*vsx_add<mode>3" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa") + (plus:VSX_B (match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa") + (match_operand:VSX_B 2 "vsx_register_operand" "<VSr>,wa")))] + "VECTOR_UNIT_VSX_P (<MODE>mode)" + "x<VSv>add<VSs> %x0,%x1,%x2" + [(set_attr "type" "<VStype_simple>") + (set_attr "fp_type" "<VSfptype_simple>")]) + +(define_insn "*vsx_sub<mode>3" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa") + (minus:VSX_B (match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa") + (match_operand:VSX_B 2 "vsx_register_operand" "<VSr>,wa")))] + "VECTOR_UNIT_VSX_P (<MODE>mode)" + "x<VSv>sub<VSs> %x0,%x1,%x2" + [(set_attr "type" "<VStype_simple>") + (set_attr "fp_type" "<VSfptype_simple>")]) + +(define_insn "*vsx_mul<mode>3" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa") + (mult:VSX_B (match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa") + (match_operand:VSX_B 2 "vsx_register_operand" "<VSr>,wa")))] + "VECTOR_UNIT_VSX_P (<MODE>mode)" + "x<VSv>mul<VSs>
%x0,%x1,%x2" + [(set_attr "type" "<VStype_mul>") + (set_attr "fp_type" "<VSfptype_mul>")]) + +(define_insn "*vsx_div<mode>3" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa") + (div:VSX_B (match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa") + (match_operand:VSX_B 2 "vsx_register_operand" "<VSr>,wa")))] + "VECTOR_UNIT_VSX_P (<MODE>mode)" + "x<VSv>div<VSs> %x0,%x1,%x2" + [(set_attr "type" "<VStype_div>") + (set_attr "fp_type" "<VSfptype_div>")]) + +;; *tdiv* instruction returning the FG flag +(define_expand "vsx_tdiv<mode>3_fg" + [(set (match_dup 3) + (unspec:CCFP [(match_operand:VSX_B 1 "vsx_register_operand" "") + (match_operand:VSX_B 2 "vsx_register_operand" "")] + UNSPEC_VSX_TDIV)) + (set (match_operand:SI 0 "gpc_reg_operand" "") + (gt:SI (match_dup 3) + (const_int 0)))] + "VECTOR_UNIT_VSX_P (<MODE>mode)" +{ + operands[3] = gen_reg_rtx (CCFPmode); +}) + +;; *tdiv* instruction returning the FE flag +(define_expand "vsx_tdiv<mode>3_fe" + [(set (match_dup 3) + (unspec:CCFP [(match_operand:VSX_B 1 "vsx_register_operand" "") + (match_operand:VSX_B 2 "vsx_register_operand" "")] + UNSPEC_VSX_TDIV)) + (set (match_operand:SI 0 "gpc_reg_operand" "") + (eq:SI (match_dup 3) + (const_int 0)))] + "VECTOR_UNIT_VSX_P (<MODE>mode)" +{ + operands[3] = gen_reg_rtx (CCFPmode); +}) + +(define_insn "*vsx_tdiv<mode>3_internal" + [(set (match_operand:CCFP 0 "cc_reg_operand" "=x,x") + (unspec:CCFP [(match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa") + (match_operand:VSX_B 2 "vsx_register_operand" "<VSr>,wa")] + UNSPEC_VSX_TDIV))] + "VECTOR_UNIT_VSX_P (<MODE>mode)" + "x<VSv>tdiv<VSs> %0,%x1,%x2" + [(set_attr "type" "<VStype_simple>") + (set_attr "fp_type" "<VSfptype_simple>")]) + +(define_insn "vsx_fre<mode>2" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa") + (unspec:VSX_B [(match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")] + UNSPEC_FRES))] + "VECTOR_UNIT_VSX_P (<MODE>mode)" + "x<VSv>re<VSs> %x0,%x1" + [(set_attr "type" 
"<VStype_simple>") + (set_attr "fp_type" "<VSfptype_simple>")]) + +(define_insn "*vsx_neg<mode>2" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa") + (neg:VSX_B (match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")))] + "VECTOR_UNIT_VSX_P (<MODE>mode)" + "x<VSv>neg<VSs> %x0,%x1" + [(set_attr "type" "<VStype_simple>") + (set_attr "fp_type" "<VSfptype_simple>")]) + +(define_insn "*vsx_abs<mode>2" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa") + (abs:VSX_B (match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")))] + "VECTOR_UNIT_VSX_P (<MODE>mode)" + "x<VSv>abs<VSs> %x0,%x1" + [(set_attr "type" "<VStype_simple>") + (set_attr "fp_type" "<VSfptype_simple>")]) + +(define_insn "vsx_nabs<mode>2" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa") + (neg:VSX_B + (abs:VSX_B + (match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa"))))] + "VECTOR_UNIT_VSX_P (<MODE>mode)" + "x<VSv>nabs<VSs> %x0,%x1" + [(set_attr "type" "<VStype_simple>") + (set_attr "fp_type" "<VSfptype_simple>")]) + +(define_insn "vsx_smax<mode>3" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa") + (smax:VSX_B (match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa") + (match_operand:VSX_B 2 "vsx_register_operand" "<VSr>,wa")))] + "VECTOR_UNIT_VSX_P (<MODE>mode)" + "x<VSv>max<VSs> %x0,%x1,%x2" + [(set_attr "type" "<VStype_simple>") + (set_attr "fp_type" "<VSfptype_simple>")]) + +(define_insn "*vsx_smin<mode>3" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa") + (smin:VSX_B (match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa") + (match_operand:VSX_B 2 "vsx_register_operand" "<VSr>,wa")))] + "VECTOR_UNIT_VSX_P (<MODE>mode)" + "x<VSv>min<VSs> %x0,%x1,%x2" + [(set_attr "type" "<VStype_simple>") + (set_attr "fp_type" "<VSfptype_simple>")]) + +(define_insn "*vsx_sqrt<mode>2" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa") + (sqrt:VSX_B (match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")))] 
+ "VECTOR_UNIT_VSX_P (<MODE>mode)" + "x<VSv>sqrt<VSs> %x0,%x1" + [(set_attr "type" "<VStype_sqrt>") + (set_attr "fp_type" "<VSfptype_sqrt>")]) + +(define_insn "vsx_rsqrte<mode>2" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa") + (unspec:VSX_B [(match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")] + UNSPEC_VSX_RSQRTE))] + "VECTOR_UNIT_VSX_P (<MODE>mode)" + "x<VSv>rsqrte<VSs> %x0,%x1" + [(set_attr "type" "<VStype_simple>") + (set_attr "fp_type" "<VSfptype_simple>")]) + +;; *tsqrt* returning the fg flag +(define_expand "vsx_tsqrt<mode>2_fg" + [(set (match_dup 3) + (unspec:CCFP [(match_operand:VSX_B 1 "vsx_register_operand" "")] + UNSPEC_VSX_TSQRT)) + (set (match_operand:SI 0 "gpc_reg_operand" "") + (gt:SI (match_dup 3) + (const_int 0)))] + "VECTOR_UNIT_VSX_P (<MODE>mode)" +{ + operands[3] = gen_reg_rtx (CCFPmode); +}) + +;; *tsqrt* returning the fe flag +(define_expand "vsx_tsqrt<mode>2_fe" + [(set (match_dup 3) + (unspec:CCFP [(match_operand:VSX_B 1 "vsx_register_operand" "")] + UNSPEC_VSX_TSQRT)) + (set (match_operand:SI 0 "gpc_reg_operand" "") + (eq:SI (match_dup 3) + (const_int 0)))] + "VECTOR_UNIT_VSX_P (<MODE>mode)" +{ + operands[3] = gen_reg_rtx (CCFPmode); +}) + +(define_insn "*vsx_tsqrt<mode>2_internal" + [(set (match_operand:CCFP 0 "cc_reg_operand" "=x,x") + (unspec:CCFP [(match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")] + UNSPEC_VSX_TSQRT))] + "VECTOR_UNIT_VSX_P (<MODE>mode)" + "x<VSv>tsqrt<VSs> %0,%x1" + [(set_attr "type" "<VStype_simple>") + (set_attr "fp_type" "<VSfptype_simple>")]) + +;; Fused vector multiply/add instructions + +;; Note we have a pattern for the multiply/add operations that uses unspec and +;; does not check -mfused-madd to allow users to use these ops when they know +;; they want the fused multiply/add. 
+ +(define_expand "vsx_fmadd<mode>4" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "") + (plus:VSX_B + (mult:VSX_B + (match_operand:VSX_B 1 "vsx_register_operand" "") + (match_operand:VSX_B 2 "vsx_register_operand" "")) + (match_operand:VSX_B 3 "vsx_register_operand" "")))] + "VECTOR_UNIT_VSX_P (<MODE>mode)" +{ + if (!TARGET_FUSED_MADD) + { + emit_insn (gen_vsx_fmadd<mode>4_2 (operands[0], operands[1], operands[2], + operands[3])); + DONE; + } +}) + +(define_insn "*vsx_fmadd<mode>4_1" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,<VSr>,?wa,?wa") + (plus:VSX_B + (mult:VSX_B + (match_operand:VSX_B 1 "vsx_register_operand" "%<VSr>,<VSr>,wa,wa") + (match_operand:VSX_B 2 "vsx_register_operand" "<VSr>,0,wa,0")) + (match_operand:VSX_B 3 "vsx_register_operand" "0,<VSr>,0,wa")))] + "VECTOR_UNIT_VSX_P (<MODE>mode) && TARGET_FUSED_MADD" + "@ + x<VSv>madda<VSs> %x0,%x1,%x2 + x<VSv>maddm<VSs> %x0,%x1,%x3 + x<VSv>madda<VSs> %x0,%x1,%x2 + x<VSv>maddm<VSs> %x0,%x1,%x3" + [(set_attr "type" "<VStype_mul>") + (set_attr "fp_type" "<VSfptype_mul>")]) + +(define_insn "vsx_fmadd<mode>4_2" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,<VSr>,?wa,?wa") + (unspec:VSX_B [(match_operand:VSX_B 1 "vsx_register_operand" "%<VSr>,<VSr>,wa,wa") + (match_operand:VSX_B 2 "vsx_register_operand" "<VSr>,0,wa,0") + (match_operand:VSX_B 3 "vsx_register_operand" "0,<VSr>,0,wa")] + UNSPEC_VSX_MADD))] + "VECTOR_UNIT_VSX_P (<MODE>mode)" + "@ + x<VSv>madda<VSs> %x0,%x1,%x2 + x<VSv>maddm<VSs> %x0,%x1,%x3 + x<VSv>madda<VSs> %x0,%x1,%x2 + x<VSv>maddm<VSs> %x0,%x1,%x3" + [(set_attr "type" "<VStype_mul>") + (set_attr "fp_type" "<VSfptype_mul>")]) + +(define_expand "vsx_fmsub<mode>4" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "") + (minus:VSX_B + (mult:VSX_B + (match_operand:VSX_B 1 "vsx_register_operand" "") + (match_operand:VSX_B 2 "vsx_register_operand" "")) + (match_operand:VSX_B 3 "vsx_register_operand" "")))] + "VECTOR_UNIT_VSX_P (<MODE>mode)" +{ + if 
(!TARGET_FUSED_MADD) + { + emit_insn (gen_vsx_fmsub<mode>4_2 (operands[0], operands[1], operands[2], + operands[3])); + DONE; + } +}) + +(define_insn "*vsx_fmsub<mode>4_1" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,<VSr>,?wa,?wa") + (minus:VSX_B + (mult:VSX_B + (match_operand:VSX_B 1 "vsx_register_operand" "%<VSr>,<VSr>,wa,wa") + (match_operand:VSX_B 2 "vsx_register_operand" "<VSr>,0,wa,0")) + (match_operand:VSX_B 3 "vsx_register_operand" "0,<VSr>,0,wa")))] + "VECTOR_UNIT_VSX_P (<MODE>mode) && TARGET_FUSED_MADD" + "@ + x<VSv>msuba<VSs> %x0,%x1,%x2 + x<VSv>msubm<VSs> %x0,%x1,%x3 + x<VSv>msuba<VSs> %x0,%x1,%x2 + x<VSv>msubm<VSs> %x0,%x1,%x3" + [(set_attr "type" "<VStype_mul>") + (set_attr "fp_type" "<VSfptype_mul>")]) + +(define_insn "vsx_fmsub<mode>4_2" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,<VSr>,?wa,?wa") + (unspec:VSX_B [(match_operand:VSX_B 1 "vsx_register_operand" "%<VSr>,<VSr>,wa,wa") + (match_operand:VSX_B 2 "vsx_register_operand" "<VSr>,0,wa,0") + (match_operand:VSX_B 3 "vsx_register_operand" "0,<VSr>,0,wa")] + UNSPEC_VSX_MSUB))] + "VECTOR_UNIT_VSX_P (<MODE>mode)" + "@ + x<VSv>msuba<VSs> %x0,%x1,%x2 + x<VSv>msubm<VSs> %x0,%x1,%x3 + x<VSv>msuba<VSs> %x0,%x1,%x2 + x<VSv>msubm<VSs> %x0,%x1,%x3" + [(set_attr "type" "<VStype_mul>") + (set_attr "fp_type" "<VSfptype_mul>")]) + +(define_expand "vsx_fnmadd<mode>4" + [(match_operand:VSX_B 0 "vsx_register_operand" "") + (match_operand:VSX_B 1 "vsx_register_operand" "") + (match_operand:VSX_B 2 "vsx_register_operand" "") + (match_operand:VSX_B 3 "vsx_register_operand" "")] + "VECTOR_UNIT_VSX_P (<MODE>mode)" +{ + if (TARGET_FUSED_MADD && HONOR_SIGNED_ZEROS (DFmode)) + { + emit_insn (gen_vsx_fnmadd<mode>4_1 (operands[0], operands[1], + operands[2], operands[3])); + DONE; + } + else if (TARGET_FUSED_MADD && !HONOR_SIGNED_ZEROS (DFmode)) + { + emit_insn (gen_vsx_fnmadd<mode>4_2 (operands[0], operands[1], + operands[2], operands[3])); + DONE; + } + else + { + emit_insn 
(gen_vsx_fnmadd<mode>4_3 (operands[0], operands[1], + operands[2], operands[3])); + DONE; + } +}) + +(define_insn "vsx_fnmadd<mode>4_1" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,<VSr>,?wa,?wa") + (neg:VSX_B + (plus:VSX_B + (mult:VSX_B + (match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,<VSr>,wa,wa") + (match_operand:VSX_B 2 "vsx_register_operand" "<VSr>,0,wa,0")) + (match_operand:VSX_B 3 "vsx_register_operand" "0,<VSr>,0,wa"))))] + "VECTOR_UNIT_VSX_P (<MODE>mode) && TARGET_FUSED_MADD + && HONOR_SIGNED_ZEROS (DFmode)" + "@ + x<VSv>nmadda<VSs> %x0,%x1,%x2 + x<VSv>nmaddm<VSs> %x0,%x1,%x3 + x<VSv>nmadda<VSs> %x0,%x1,%x2 + x<VSv>nmaddm<VSs> %x0,%x1,%x3" + [(set_attr "type" "<VStype_mul>") + (set_attr "fp_type" "<VSfptype_mul>")]) + +(define_insn "vsx_fnmadd<mode>4_2" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,<VSr>,?wa,?wa") + (minus:VSX_B + (mult:VSX_B + (neg:VSX_B + (match_operand:VSX_B 1 "gpc_reg_operand" "<VSr>,<VSr>,wa,wa")) + (match_operand:VSX_B 2 "gpc_reg_operand" "<VSr>,0,wa,0")) + (match_operand:VSX_B 3 "vsx_register_operand" "0,<VSr>,0,wa")))] + "VECTOR_UNIT_VSX_P (<MODE>mode) && TARGET_FUSED_MADD + && !HONOR_SIGNED_ZEROS (DFmode)" + "@ + x<VSv>nmadda<VSs> %x0,%x1,%x2 + x<VSv>nmaddm<VSs> %x0,%x1,%x3 + x<VSv>nmadda<VSs> %x0,%x1,%x2 + x<VSv>nmaddm<VSs> %x0,%x1,%x3" + [(set_attr "type" "<VStype_mul>") + (set_attr "fp_type" "<VSfptype_mul>")]) + +(define_insn "vsx_fnmadd<mode>4_3" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,<VSr>,?wa,?wa") + (unspec:VSX_B [(match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,<VSr>,wa,wa") + (match_operand:VSX_B 2 "vsx_register_operand" "<VSr>,0,wa,0") + (match_operand:VSX_B 3 "vsx_register_operand" "0,<VSr>,0,wa")] + UNSPEC_VSX_NMADD))] + "VECTOR_UNIT_VSX_P (<MODE>mode)" + "@ + x<VSv>nmadda<VSs> %x0,%x1,%x2 + x<VSv>nmaddm<VSs> %x0,%x1,%x3 + x<VSv>nmadda<VSs> %x0,%x1,%x2 + x<VSv>nmaddm<VSs> %x0,%x1,%x3" + [(set_attr "type" "<VStype_mul>") + (set_attr "fp_type" 
"<VSfptype_mul>")]) + +(define_expand "vsx_fnmsub<mode>4" + [(match_operand:VSX_B 0 "vsx_register_operand" "") + (match_operand:VSX_B 1 "vsx_register_operand" "") + (match_operand:VSX_B 2 "vsx_register_operand" "") + (match_operand:VSX_B 3 "vsx_register_operand" "")] + "VECTOR_UNIT_VSX_P (<MODE>mode)" +{ + if (TARGET_FUSED_MADD && HONOR_SIGNED_ZEROS (DFmode)) + { + emit_insn (gen_vsx_fnmsub<mode>4_1 (operands[0], operands[1], + operands[2], operands[3])); + DONE; + } + else if (TARGET_FUSED_MADD && !HONOR_SIGNED_ZEROS (DFmode)) + { + emit_insn (gen_vsx_fnmsub<mode>4_2 (operands[0], operands[1], + operands[2], operands[3])); + DONE; + } + else + { + emit_insn (gen_vsx_fnmsub<mode>4_3 (operands[0], operands[1], + operands[2], operands[3])); + DONE; + } +}) + +(define_insn "vsx_fnmsub<mode>4_1" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,<VSr>,?wa,?wa") + (neg:VSX_B + (minus:VSX_B + (mult:VSX_B + (match_operand:VSX_B 1 "vsx_register_operand" "%<VSr>,<VSr>,wa,wa") + (match_operand:VSX_B 2 "vsx_register_operand" "<VSr>,0,wa,0")) + (match_operand:VSX_B 3 "vsx_register_operand" "0,<VSr>,0,wa"))))] + "VECTOR_UNIT_VSX_P (<MODE>mode) && TARGET_FUSED_MADD + && HONOR_SIGNED_ZEROS (DFmode)" + "@ + x<VSv>nmsuba<VSs> %x0,%x1,%x2 + x<VSv>nmsubm<VSs> %x0,%x1,%x3 + x<VSv>nmsuba<VSs> %x0,%x1,%x2 + x<VSv>nmsubm<VSs> %x0,%x1,%x3" + [(set_attr "type" "<VStype_mul>") + (set_attr "fp_type" "<VSfptype_mul>")]) + +(define_insn "vsx_fnmsub<mode>4_2" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,<VSr>,?wa,?wa") + (minus:VSX_B + (match_operand:VSX_B 3 "vsx_register_operand" "0,<VSr>,0,wa") + (mult:VSX_B + (match_operand:VSX_B 1 "vsx_register_operand" "%<VSr>,<VSr>,wa,wa") + (match_operand:VSX_B 2 "vsx_register_operand" "<VSr>,0,wa,0"))))] + "VECTOR_UNIT_VSX_P (<MODE>mode) && TARGET_FUSED_MADD + && !HONOR_SIGNED_ZEROS (DFmode)" + "@ + x<VSv>nmsuba<VSs> %x0,%x1,%x2 + x<VSv>nmsubm<VSs> %x0,%x1,%x3 + x<VSv>nmsuba<VSs> %x0,%x1,%x2 + x<VSv>nmsubm<VSs> %x0,%x1,%x3" + 
[(set_attr "type" "<VStype_mul>") + (set_attr "fp_type" "<VSfptype_mul>")]) + +(define_insn "vsx_fnmsub<mode>4_3" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,<VSr>,?wa,?wa") + (unspec:VSX_B [(match_operand:VSX_B 1 "vsx_register_operand" "%<VSr>,<VSr>,wa,wa") + (match_operand:VSX_B 2 "vsx_register_operand" "<VSr>,0,wa,0") + (match_operand:VSX_B 3 "vsx_register_operand" "0,<VSr>,0,wa")] + UNSPEC_VSX_NMSUB))] + "VECTOR_UNIT_VSX_P (<MODE>mode)" + "@ + x<VSv>nmsuba<VSs> %x0,%x1,%x2 + x<VSv>nmsubm<VSs> %x0,%x1,%x3 + x<VSv>nmsuba<VSs> %x0,%x1,%x2 + x<VSv>nmsubm<VSs> %x0,%x1,%x3" + [(set_attr "type" "<VStype_mul>") + (set_attr "fp_type" "<VSfptype_mul>")]) + +;; Vector conditional expressions (no scalar version for these instructions) +(define_insn "vsx_eq<mode>" + [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa") + (eq:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,wa") + (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,wa")))] + "VECTOR_UNIT_VSX_P (<MODE>mode)" + "xvcmpeq<VSs> %x0,%x1,%x2" + [(set_attr "type" "<VStype_simple>") + (set_attr "fp_type" "<VSfptype_simple>")]) + +(define_insn "vsx_gt<mode>" + [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa") + (gt:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,wa") + (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,wa")))] + "VECTOR_UNIT_VSX_P (<MODE>mode)" + "xvcmpgt<VSs> %x0,%x1,%x2" + [(set_attr "type" "<VStype_simple>") + (set_attr "fp_type" "<VSfptype_simple>")]) + +(define_insn "*vsx_ge<mode>" + [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa") + (ge:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,wa") + (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,wa")))] + "VECTOR_UNIT_VSX_P (<MODE>mode)" + "xvcmpge<VSs> %x0,%x1,%x2" + [(set_attr "type" "<VStype_simple>") + (set_attr "fp_type" "<VSfptype_simple>")]) + +;; Floating point scalar compare +(define_insn "*vsx_cmpdf_internal1" + [(set (match_operand:CCFP 0 
"cc_reg_operand" "=y,?y") + (compare:CCFP (match_operand:DF 1 "gpc_reg_operand" "ws,wa") + (match_operand:DF 2 "gpc_reg_operand" "ws,wa")))] + "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT + && VECTOR_UNIT_VSX_P (DFmode)" + "xscmpudp %0,%x1,%x2" + [(set_attr "type" "fpcompare")]) + +;; Compare vectors producing a vector result and a predicate, setting CR6 to +;; indicate a combined status +(define_insn "*vsx_eq_<mode>_p" + [(set (reg:CC 74) + (unspec:CC + [(eq:CC (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,?wa") + (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,?wa"))] + UNSPEC_PREDICATE)) + (set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa") + (eq:VSX_F (match_dup 1) + (match_dup 2)))] + "VECTOR_UNIT_VSX_P (<MODE>mode)" + "xvcmpeq<VSs>. %x0,%x1,%x2" + [(set_attr "type" "veccmp")]) + +(define_insn "*vsx_gt_<mode>_p" + [(set (reg:CC 74) + (unspec:CC + [(gt:CC (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,?wa") + (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,?wa"))] + UNSPEC_PREDICATE)) + (set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa") + (gt:VSX_F (match_dup 1) + (match_dup 2)))] + "VECTOR_UNIT_VSX_P (<MODE>mode)" + "xvcmpgt<VSs>. %x0,%x1,%x2" + [(set_attr "type" "veccmp")]) + +(define_insn "*vsx_ge_<mode>_p" + [(set (reg:CC 74) + (unspec:CC + [(ge:CC (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,?wa") + (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,?wa"))] + UNSPEC_PREDICATE)) + (set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa") + (ge:VSX_F (match_dup 1) + (match_dup 2)))] + "VECTOR_UNIT_VSX_P (<MODE>mode)" + "xvcmpge<VSs>. 
%x0,%x1,%x2" + [(set_attr "type" "veccmp")]) + +;; Vector select +(define_insn "*vsx_xxsel<mode>" + [(set (match_operand:VSX_L 0 "vsx_register_operand" "=<VSr>,?wa") + (if_then_else:VSX_L + (ne:CC (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,wa") + (const_int 0)) + (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,wa") + (match_operand:VSX_L 3 "vsx_register_operand" "<VSr>,wa")))] + "VECTOR_MEM_VSX_P (<MODE>mode)" + "xxsel %x0,%x3,%x2,%x1" + [(set_attr "type" "vecperm")]) + +(define_insn "*vsx_xxsel<mode>_uns" + [(set (match_operand:VSX_L 0 "vsx_register_operand" "=<VSr>,?wa") + (if_then_else:VSX_L + (ne:CCUNS (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,wa") + (const_int 0)) + (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,wa") + (match_operand:VSX_L 3 "vsx_register_operand" "<VSr>,wa")))] + "VECTOR_MEM_VSX_P (<MODE>mode)" + "xxsel %x0,%x3,%x2,%x1" + [(set_attr "type" "vecperm")]) + +;; Copy sign +(define_insn "vsx_copysign<mode>3" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa") + (if_then_else:VSX_B + (ge:VSX_B (match_operand:VSX_B 2 "vsx_register_operand" "<VSr>,wa") + (match_operand:VSX_B 3 "zero_constant" "j,j")) + (abs:VSX_B (match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")) + (neg:VSX_B (abs:VSX_B (match_dup 1)))))] + "VECTOR_UNIT_VSX_P (<MODE>mode)" + "x<VSv>cpsgn<VSs> %x0,%x2,%x1" + [(set_attr "type" "<VStype_simple>") + (set_attr "fp_type" "<VSfptype_simple>")]) + +;; For the conversions, limit the register class for the integer value to be +;; the fprs because we don't want to add the altivec registers to movdi/movsi. +;; For the unsigned tests, there isn't a generic double -> unsigned conversion +;; in rs6000.md so don't test VECTOR_UNIT_VSX_P, just test against VSX. 
+(define_insn "vsx_float<VSi><mode>2" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa") + (float:VSX_B (match_operand:<VSI> 1 "vsx_register_operand" "<VSr2>,<VSr3>")))] + "VECTOR_UNIT_VSX_P (<MODE>mode)" + "x<VSv>cvsx<VSc><VSs> %x0,%x1" + [(set_attr "type" "<VStype_simple>") + (set_attr "fp_type" "<VSfptype_simple>")]) + +(define_insn "vsx_floatuns<VSi><mode>2" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa") + (unsigned_float:VSX_B (match_operand:<VSI> 1 "vsx_register_operand" "<VSr2>,<VSr3>")))] + "VECTOR_UNIT_VSX_P (<MODE>mode)" + "x<VSv>cvux<VSc><VSs> %x0,%x1" + [(set_attr "type" "<VStype_simple>") + (set_attr "fp_type" "<VSfptype_simple>")]) + +(define_insn "vsx_fix_trunc<mode><VSi>2" + [(set (match_operand:<VSI> 0 "vsx_register_operand" "=<VSr2>,?<VSr3>") + (fix:<VSI> (match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")))] + "VECTOR_UNIT_VSX_P (<MODE>mode)" + "x<VSv>cv<VSs>sx<VSc>s %x0,%x1" + [(set_attr "type" "<VStype_simple>") + (set_attr "fp_type" "<VSfptype_simple>")]) + +(define_insn "vsx_fixuns_trunc<mode><VSi>2" + [(set (match_operand:<VSI> 0 "vsx_register_operand" "=<VSr2>,?<VSr3>") + (unsigned_fix:<VSI> (match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")))] + "VECTOR_UNIT_VSX_P (<MODE>mode)" + "x<VSv>cv<VSs>ux<VSc>s %x0,%x1" + [(set_attr "type" "<VStype_simple>") + (set_attr "fp_type" "<VSfptype_simple>")]) + +;; Math rounding functions +(define_insn "vsx_x<VSv>r<VSs>i" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa") + (unspec:VSX_B [(match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")] + UNSPEC_VSX_ROUND_I))] + "VECTOR_UNIT_VSX_P (<MODE>mode)" + "x<VSv>r<VSs>i %x0,%x1" + [(set_attr "type" "<VStype_simple>") + (set_attr "fp_type" "<VSfptype_simple>")]) + +(define_insn "vsx_x<VSv>r<VSs>ic" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa") + (unspec:VSX_B [(match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")] + UNSPEC_VSX_ROUND_IC))] + "VECTOR_UNIT_VSX_P 
(<MODE>mode)" + "x<VSv>r<VSs>ic %x0,%x1" + [(set_attr "type" "<VStype_simple>") + (set_attr "fp_type" "<VSfptype_simple>")]) + +(define_insn "vsx_btrunc<mode>2" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa") + (fix:VSX_B (match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")))] + "VECTOR_UNIT_VSX_P (<MODE>mode)" + "x<VSv>r<VSs>iz %x0,%x1" + [(set_attr "type" "<VStype_simple>") + (set_attr "fp_type" "<VSfptype_simple>")]) + +(define_insn "*vsx_b2trunc<mode>2" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa") + (unspec:VSX_B [(match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")] + UNSPEC_FRIZ))] + "VECTOR_UNIT_VSX_P (<MODE>mode)" + "x<VSv>r<VSs>iz %x0,%x1" + [(set_attr "type" "<VStype_simple>") + (set_attr "fp_type" "<VSfptype_simple>")]) + +(define_insn "vsx_floor<mode>2" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa") + (unspec:VSX_B [(match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")] + UNSPEC_FRIM))] + "VECTOR_UNIT_VSX_P (<MODE>mode)" + "x<VSv>r<VSs>im %x0,%x1" + [(set_attr "type" "<VStype_simple>") + (set_attr "fp_type" "<VSfptype_simple>")]) + +(define_insn "vsx_ceil<mode>2" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa") + (unspec:VSX_B [(match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")] + UNSPEC_FRIP))] + "VECTOR_UNIT_VSX_P (<MODE>mode)" + "x<VSv>r<VSs>ip %x0,%x1" + [(set_attr "type" "<VStype_simple>") + (set_attr "fp_type" "<VSfptype_simple>")]) + + +;; VSX convert to/from double vector + +;; Convert between single and double precision +;; Don't use xscvspdp and xscvdpsp for scalar conversions, since the normal +;; scalar single precision instructions internally use the double format. 
+;; Prefer the altivec registers, since we likely will need to do a vperm +(define_insn "vsx_<VS_spdp_insn>" + [(set (match_operand:<VS_spdp_res> 0 "vsx_register_operand" "=<VSr4>,?wa") + (unspec:<VS_spdp_res> [(match_operand:VSX_SPDP 1 "vsx_register_operand" "<VSr5>,wa")] + UNSPEC_VSX_CVSPDP))] + "VECTOR_UNIT_VSX_P (<MODE>mode)" + "<VS_spdp_insn> %x0,%x1" + [(set_attr "type" "<VS_spdp_type>")]) + +;; xscvspdp, represent the scalar SF type as V4SF +(define_insn "vsx_xscvspdp" + [(set (match_operand:DF 0 "vsx_register_operand" "=ws,?wa") + (unspec:DF [(match_operand:V4SF 1 "vsx_register_operand" "wa,wa")] + UNSPEC_VSX_CVSPDP))] + "VECTOR_UNIT_VSX_P (DFmode)" + "xscvspdp %x0,%x1" + [(set_attr "type" "fp")]) + +;; xscvdpsp used for splat'ing a scalar to V4SF, knowing that the internal SF +;; format of scalars is actually DF. +(define_insn "vsx_xscvdpsp_scalar" + [(set (match_operand:V4SF 0 "vsx_register_operand" "=wa") + (unspec:V4SF [(match_operand:SF 1 "vsx_register_operand" "f")] + UNSPEC_VSX_CVSPDP))] + "VECTOR_UNIT_VSX_P (DFmode)" + "xscvdpsp %x0,%x1" + [(set_attr "type" "fp")]) + +;; Convert from 64-bit to 32-bit types +;; Note, favor the Altivec registers since the usual use of these instructions +;; is in vector converts and we need to use the Altivec vperm instruction. 
+ +(define_insn "vsx_xvcvdpsxws" + [(set (match_operand:V4SI 0 "vsx_register_operand" "=v,?wa") + (unspec:V4SI [(match_operand:V2DF 1 "vsx_register_operand" "wd,wa")] + UNSPEC_VSX_CVDPSXWS))] + "VECTOR_UNIT_VSX_P (V2DFmode)" + "xvcvdpsxws %x0,%x1" + [(set_attr "type" "vecfloat")]) + +(define_insn "vsx_xvcvdpuxws" + [(set (match_operand:V4SI 0 "vsx_register_operand" "=v,?wa") + (unspec:V4SI [(match_operand:V2DF 1 "vsx_register_operand" "wd,wa")] + UNSPEC_VSX_CVDPUXWS))] + "VECTOR_UNIT_VSX_P (V2DFmode)" + "xvcvdpuxws %x0,%x1" + [(set_attr "type" "vecfloat")]) + +(define_insn "vsx_xvcvsxdsp" + [(set (match_operand:V4SI 0 "vsx_register_operand" "=wd,?wa") + (unspec:V4SI [(match_operand:V2DF 1 "vsx_register_operand" "wf,wa")] + UNSPEC_VSX_CVSXDSP))] + "VECTOR_UNIT_VSX_P (V2DFmode)" + "xvcvsxdsp %x0,%x1" + [(set_attr "type" "vecfloat")]) + +(define_insn "vsx_xvcvuxdsp" + [(set (match_operand:V4SI 0 "vsx_register_operand" "=wd,?wa") + (unspec:V4SI [(match_operand:V2DF 1 "vsx_register_operand" "wf,wa")] + UNSPEC_VSX_CVUXDSP))] + "VECTOR_UNIT_VSX_P (V2DFmode)" + "xvcvuxdsp %x0,%x1" + [(set_attr "type" "vecfloat")]) + +;; Convert from 32-bit to 64-bit types +(define_insn "vsx_xvcvsxwdp" + [(set (match_operand:V2DF 0 "vsx_register_operand" "=wd,?wa") + (unspec:V2DF [(match_operand:V4SI 1 "vsx_register_operand" "wf,wa")] + UNSPEC_VSX_CVSXWDP))] + "VECTOR_UNIT_VSX_P (V2DFmode)" + "xvcvsxwdp %x0,%x1" + [(set_attr "type" "vecfloat")]) + +(define_insn "vsx_xvcvuxwdp" + [(set (match_operand:V2DF 0 "vsx_register_operand" "=wd,?wa") + (unspec:V2DF [(match_operand:V4SI 1 "vsx_register_operand" "wf,wa")] + UNSPEC_VSX_CVUXWDP))] + "VECTOR_UNIT_VSX_P (V2DFmode)" + "xvcvuxwdp %x0,%x1" + [(set_attr "type" "vecfloat")]) + +(define_insn "vsx_xvcvspsxds" + [(set (match_operand:V2DI 0 "vsx_register_operand" "=v,?wa") + (unspec:V2DI [(match_operand:V4SF 1 "vsx_register_operand" "wd,wa")] + UNSPEC_VSX_CVSPSXDS))] + "VECTOR_UNIT_VSX_P (V2DFmode)" + "xvcvspsxds %x0,%x1" + [(set_attr "type" 
"vecfloat")]) + +(define_insn "vsx_xvcvspuxds" + [(set (match_operand:V2DI 0 "vsx_register_operand" "=v,?wa") + (unspec:V2DI [(match_operand:V4SF 1 "vsx_register_operand" "wd,wa")] + UNSPEC_VSX_CVSPUXDS))] + "VECTOR_UNIT_VSX_P (V2DFmode)" + "xvcvspuxds %x0,%x1" + [(set_attr "type" "vecfloat")]) + +;; Logical and permute operations +(define_insn "*vsx_and<mode>3" + [(set (match_operand:VSX_L 0 "vsx_register_operand" "=<VSr>,?wa") + (and:VSX_L + (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,?wa") + (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,?wa")))] + "VECTOR_MEM_VSX_P (<MODE>mode)" + "xxland %x0,%x1,%x2" + [(set_attr "type" "vecsimple")]) + +(define_insn "*vsx_ior<mode>3" + [(set (match_operand:VSX_L 0 "vsx_register_operand" "=<VSr>,?wa") + (ior:VSX_L (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,?wa") + (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,?wa")))] + "VECTOR_MEM_VSX_P (<MODE>mode)" + "xxlor %x0,%x1,%x2" + [(set_attr "type" "vecsimple")]) + +(define_insn "*vsx_xor<mode>3" + [(set (match_operand:VSX_L 0 "vsx_register_operand" "=<VSr>,?wa") + (xor:VSX_L + (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,?wa") + (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,?wa")))] + "VECTOR_MEM_VSX_P (<MODE>mode)" + "xxlxor %x0,%x1,%x2" + [(set_attr "type" "vecsimple")]) + +(define_insn "*vsx_one_cmpl<mode>2" + [(set (match_operand:VSX_L 0 "vsx_register_operand" "=<VSr>,?wa") + (not:VSX_L + (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,?wa")))] + "VECTOR_MEM_VSX_P (<MODE>mode)" + "xxlnor %x0,%x1,%x1" + [(set_attr "type" "vecsimple")]) + +(define_insn "*vsx_nor<mode>3" + [(set (match_operand:VSX_L 0 "vsx_register_operand" "=<VSr>,?wa") + (not:VSX_L + (ior:VSX_L + (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,?wa") + (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,?wa"))))] + "VECTOR_MEM_VSX_P (<MODE>mode)" + "xxlnor %x0,%x1,%x2" + [(set_attr "type" "vecsimple")]) + +(define_insn "*vsx_andc<mode>3" + [(set 
(match_operand:VSX_L 0 "vsx_register_operand" "=<VSr>,?wa") + (and:VSX_L + (not:VSX_L + (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,?wa")) + (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,?wa")))] + "VECTOR_MEM_VSX_P (<MODE>mode)" + "xxlandc %x0,%x1,%x2" + [(set_attr "type" "vecsimple")]) + + +;; Permute operations + +;; Build a V2DF/V2DI vector from two scalars +(define_insn "vsx_concat_<mode>" + [(set (match_operand:VSX_D 0 "vsx_register_operand" "=wd,?wa") + (unspec:VSX_D + [(match_operand:<VS_scalar> 1 "vsx_register_operand" "ws,wa") + (match_operand:<VS_scalar> 2 "vsx_register_operand" "ws,wa")] + UNSPEC_VSX_CONCAT))] + "VECTOR_MEM_VSX_P (<MODE>mode)" + "xxpermdi %x0,%x1,%x2,0" + [(set_attr "type" "vecperm")]) + +;; Special purpose concat using xxpermdi to glue two single precision values +;; together, relying on the fact that internally scalar floats are represented +;; as doubles. This is used to initialize a V4SF vector with 4 floats +(define_insn "vsx_concat_v2sf" + [(set (match_operand:V2DF 0 "vsx_register_operand" "=wd,?wa") + (unspec:V2DF + [(match_operand:SF 1 "vsx_register_operand" "f,f") + (match_operand:SF 2 "vsx_register_operand" "f,f")] + UNSPEC_VSX_CONCAT))] + "VECTOR_MEM_VSX_P (V2DFmode)" + "xxpermdi %x0,%x1,%x2,0" + [(set_attr "type" "vecperm")]) + +;; Set the element of a V2DI/V2DF mode +(define_insn "vsx_set_<mode>" + [(set (match_operand:VSX_D 0 "vsx_register_operand" "=wd,?wa") + (unspec:VSX_D [(match_operand:VSX_D 1 "vsx_register_operand" "wd,wa") + (match_operand:<VS_scalar> 2 "vsx_register_operand" "ws,wa") + (match_operand:QI 3 "u5bit_cint_operand" "i,i")] + UNSPEC_VSX_SET))] + "VECTOR_MEM_VSX_P (<MODE>mode)" +{ + if (INTVAL (operands[3]) == 0) + return \"xxpermdi %x0,%x1,%x2,1\"; + else if (INTVAL (operands[3]) == 1) + return \"xxpermdi %x0,%x2,%x1,0\"; + else + gcc_unreachable (); +} + [(set_attr "type" "vecperm")]) + +;; Extract a DF/DI element from V2DF/V2DI +(define_insn "vsx_extract_<mode>" + [(set 
(match_operand:<VS_scalar> 0 "vsx_register_operand" "=ws,d,?wa") + (vec_select:<VS_scalar> (match_operand:VSX_D 1 "vsx_register_operand" "wd,wd,wa") + (parallel + [(match_operand:QI 2 "u5bit_cint_operand" "i,i,i")])))] + "VECTOR_MEM_VSX_P (<MODE>mode)" +{ + gcc_assert (UINTVAL (operands[2]) <= 1); + operands[3] = GEN_INT (INTVAL (operands[2]) << 1); + return \"xxpermdi %x0,%x1,%x1,%3\"; +} + [(set_attr "type" "vecperm")]) + +;; Optimize extracting element 0 from memory +(define_insn "*vsx_extract_<mode>_zero" + [(set (match_operand:<VS_scalar> 0 "vsx_register_operand" "=ws,d,?wa") + (vec_select:<VS_scalar> + (match_operand:VSX_D 1 "indexed_or_indirect_operand" "Z,Z,Z") + (parallel [(const_int 0)])))] + "VECTOR_MEM_VSX_P (<MODE>mode) && WORDS_BIG_ENDIAN" + "lxsd%U1x %x0,%y1" + [(set_attr "type" "fpload") + (set_attr "length" "4")]) + +;; General double word oriented permute, allow the other vector types for +;; optimizing the permute instruction. +(define_insn "vsx_xxpermdi_<mode>" + [(set (match_operand:VSX_L 0 "vsx_register_operand" "=wd,?wa") + (unspec:VSX_L [(match_operand:VSX_L 1 "vsx_register_operand" "wd,wa") + (match_operand:VSX_L 2 "vsx_register_operand" "wd,wa") + (match_operand:QI 3 "u5bit_cint_operand" "i,i")] + UNSPEC_VSX_XXPERMDI))] + "VECTOR_MEM_VSX_P (<MODE>mode)" + "xxpermdi %x0,%x1,%x2,%3" + [(set_attr "type" "vecperm")]) + +;; Variant of xxpermdi that is emitted by the vec_interleave functions +(define_insn "*vsx_xxpermdi2_<mode>" + [(set (match_operand:VSX_D 0 "vsx_register_operand" "=wd") + (vec_concat:VSX_D + (vec_select:<VS_scalar> + (match_operand:VSX_D 1 "vsx_register_operand" "wd") + (parallel + [(match_operand:QI 2 "u5bit_cint_operand" "i")])) + (vec_select:<VS_scalar> + (match_operand:VSX_D 3 "vsx_register_operand" "wd") + (parallel + [(match_operand:QI 4 "u5bit_cint_operand" "i")]))))] + "VECTOR_MEM_VSX_P (<MODE>mode)" +{ + gcc_assert ((UINTVAL (operands[2]) <= 1) && (UINTVAL (operands[4]) <= 1)); + operands[5] = GEN_INT (((INTVAL 
(operands[2]) & 1) << 1) + | (INTVAL (operands[4]) & 1)); + return \"xxpermdi %x0,%x1,%x3,%5\"; +} + [(set_attr "type" "vecperm")]) + +;; V2DF/V2DI splat +(define_insn "vsx_splat_<mode>" + [(set (match_operand:VSX_D 0 "vsx_register_operand" "=wd,wd,wd,?wa,?wa,?wa") + (vec_duplicate:VSX_D + (match_operand:<VS_scalar> 1 "input_operand" "ws,f,Z,wa,wa,Z")))] + "VECTOR_MEM_VSX_P (<MODE>mode)" + "@ + xxpermdi %x0,%x1,%x1,0 + xxpermdi %x0,%x1,%x1,0 + lxvdsx %x0,%y1 + xxpermdi %x0,%x1,%x1,0 + xxpermdi %x0,%x1,%x1,0 + lxvdsx %x0,%y1" + [(set_attr "type" "vecperm,vecperm,vecload,vecperm,vecperm,vecload")]) + +;; V4SF/V4SI splat +(define_insn "vsx_xxspltw_<mode>" + [(set (match_operand:VSX_W 0 "vsx_register_operand" "=wf,?wa") + (vec_duplicate:VSX_W + (vec_select:<VS_scalar> + (match_operand:VSX_W 1 "vsx_register_operand" "wf,wa") + (parallel + [(match_operand:QI 2 "u5bit_cint_operand" "i,i")]))))] + "VECTOR_MEM_VSX_P (<MODE>mode)" + "xxspltw %x0,%x1,%2" + [(set_attr "type" "vecperm")]) + +;; V4SF/V4SI interleave +(define_insn "vsx_xxmrghw_<mode>" + [(set (match_operand:VSX_W 0 "vsx_register_operand" "=wf,?wa") + (vec_merge:VSX_W + (vec_select:VSX_W + (match_operand:VSX_W 1 "vsx_register_operand" "wf,wa") + (parallel [(const_int 0) + (const_int 2) + (const_int 1) + (const_int 3)])) + (vec_select:VSX_W + (match_operand:VSX_W 2 "vsx_register_operand" "wf,wa") + (parallel [(const_int 2) + (const_int 0) + (const_int 3) + (const_int 1)])) + (const_int 5)))] + "VECTOR_MEM_VSX_P (<MODE>mode)" + "xxmrghw %x0,%x1,%x2" + [(set_attr "type" "vecperm")]) + +(define_insn "vsx_xxmrglw_<mode>" + [(set (match_operand:VSX_W 0 "vsx_register_operand" "=wf,?wa") + (vec_merge:VSX_W + (vec_select:VSX_W + (match_operand:VSX_W 1 "vsx_register_operand" "wf,wa") + (parallel [(const_int 2) + (const_int 0) + (const_int 3) + (const_int 1)])) + (vec_select:VSX_W + (match_operand:VSX_W 2 "vsx_register_operand" "wf,?wa") + (parallel [(const_int 0) + (const_int 2) + (const_int 1) + (const_int 3)])) + 
(const_int 5)))] + "VECTOR_MEM_VSX_P (<MODE>mode)" + "xxmrglw %x0,%x1,%x2" + [(set_attr "type" "vecperm")]) + +;; Shift left double by word immediate +(define_insn "vsx_xxsldwi_<mode>" + [(set (match_operand:VSX_L 0 "vsx_register_operand" "=wa") + (unspec:VSX_L [(match_operand:VSX_L 1 "vsx_register_operand" "wa") + (match_operand:VSX_L 2 "vsx_register_operand" "wa") + (match_operand:QI 3 "u5bit_cint_operand" "i")] + UNSPEC_VSX_SLDWI))] + "VECTOR_MEM_VSX_P (<MODE>mode)" + "xxsldwi %x0,%x1,%x2,%3" + [(set_attr "type" "vecperm")]) Index: gcc-4.3.4-20091019/gcc/config/rs6000/xfpu.md =================================================================== --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ gcc-4.3.4-20091019/gcc/config/rs6000/xfpu.md 2009-10-19 13:40:37.000000000 +0200 @@ -0,0 +1,140 @@ +;; Scheduling description for the Xilinx PowerPC 405 APU Floating Point Unit. +;; Copyright (C) 2008 Free Software Foundation, Inc. +;; Contributed by Michael Eager (eager@eagercon.com). +;; +;; This file is part of GCC. +;; +;; GCC is free software; you can redistribute it and/or modify +;; it under the terms of the GNU General Public License as published by +;; the Free Software Foundation; either version 3, or (at your option) +;; any later version. +;; +;; GCC is distributed in the hope that it will be useful, +;; but WITHOUT ANY WARRANTY; without even the implied warranty of +;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +;; GNU General Public License for more details. +;; +;; You should have received a copy of the GNU General Public License +;; along with GCC; see the file COPYING3. If not see +;; <http://www.gnu.org/licenses/>. + +;;---------------------------------------------------- +;; Xilinx APU FPU Pipeline Description +;; +;; - attr 'type' and 'fp_type' should definitely +;; be cleaned up at some point in the future. +;; ddiv,sdiv,dmul,smul etc are quite confusing. +;; Should use consistent fp* attrs. 
'fp_type' +;; should also go away, leaving us only with 'fp' +;; +;;---------------------------------------------------- + +;; ------------------------------------------------------------------------- +;; Latencies +;; Latest latency figures (all in FCB cycles). PowerPC to FPU frequency ratio +;; assumed to be 1/2. (most common deployment) +;; Add 2 PPC cycles for (register file access + wb) and 2 PPC cycles +;; for issue (from PPC) +;; SP DP +;; Loads: 4 6 +;; Stores: 1 2 (from availability of data) +;; Move/Abs/Neg: 1 1 +;; Add/Subtract: 5 7 +;; Multiply: 4 11 +;; Multiply-add: 10 19 +;; Convert (any): 4 6 +;; Divide/Sqrt: 27 56 +;; Compares: 1 2 +;; +;; bypasses needed for forwarding capability of the FPU. +;; Add this at some future time. +;; ------------------------------------------------------------------------- +(define_automaton "Xfpu") +(define_cpu_unit "Xfpu_issue,Xfpu_addsub,Xfpu_mul,Xfpu_div,Xfpu_sqrt" "Xfpu") + + +(define_insn_reservation "fp-default" 2 + (and (and + (eq_attr "type" "fp") + (eq_attr "fp_type" "fp_default")) + (eq_attr "cpu" "ppc405")) + "Xfpu_issue*2") + +(define_insn_reservation "fp-compare" 6 + (and (eq_attr "type" "fpcompare") ;; Inconsistent naming + (eq_attr "cpu" "ppc405")) + "Xfpu_issue*2,Xfpu_addsub") + +(define_insn_reservation "fp-addsub-s" 14 + (and (and + (eq_attr "type" "fp") + (eq_attr "fp_type" "fp_addsub_s")) + (eq_attr "cpu" "ppc405")) + "Xfpu_issue*2,Xfpu_addsub") + +(define_insn_reservation "fp-addsub-d" 18 + (and (and + (eq_attr "type" "fp") + (eq_attr "fp_type" "fp_addsub_d")) + (eq_attr "cpu" "ppc405")) + "Xfpu_issue*2,Xfpu_addsub") + +(define_insn_reservation "fp-mul-s" 12 + (and (and + (eq_attr "type" "fp") + (eq_attr "fp_type" "fp_mul_s")) + (eq_attr "cpu" "ppc405")) + "Xfpu_issue*2,Xfpu_mul") + +(define_insn_reservation "fp-mul-d" 16 ;; Actually 28. Long latencies are killing the automaton formation. Need to figure out why. 
+ (and (and + (eq_attr "type" "fp") + (eq_attr "fp_type" "fp_mul_d")) + (eq_attr "cpu" "ppc405")) + "Xfpu_issue*2,Xfpu_mul") + +(define_insn_reservation "fp-div-s" 24 ;; Actually 34 + (and (eq_attr "type" "sdiv") ;; Inconsistent attr naming + (eq_attr "cpu" "ppc405")) + "Xfpu_issue*2,Xfpu_div*10") ;; Unpipelined + +(define_insn_reservation "fp-div-d" 34 ;; Actually 116 + (and (eq_attr "type" "ddiv") + (eq_attr "cpu" "ppc405")) ;; Inconsistent attr naming + "Xfpu_issue*2,Xfpu_div*10") ;; Unpipelined + +(define_insn_reservation "fp-maddsub-s" 24 + (and (and + (eq_attr "type" "fp") + (eq_attr "fp_type" "fp_maddsub_s")) + (eq_attr "cpu" "ppc405")) + "Xfpu_issue*2,Xfpu_mul,nothing*7,Xfpu_addsub") + +(define_insn_reservation "fp-maddsub-d" 34 ;; Actually 42 + (and (and + (eq_attr "type" "dmul") ;; Inconsistent attr naming + (eq_attr "fp_type" "fp_maddsub_d")) + (eq_attr "cpu" "ppc405")) + "Xfpu_issue*2,Xfpu_mul,nothing*7,Xfpu_addsub") + +(define_insn_reservation "fp-load" 10 ;; FIXME. Is double/single precision the same ? + (and (eq_attr "type" "fpload, fpload_ux, fpload_u") + (eq_attr "cpu" "ppc405")) + "Xfpu_issue*10") + +(define_insn_reservation "fp-store" 4 + (and (eq_attr "type" "fpstore, fpstore_ux, fpstore_u") + (eq_attr "cpu" "ppc405")) + "Xfpu_issue*4") + +(define_insn_reservation "fp-sqrt-s" 24 ;; Actually 56 + (and (eq_attr "type" "ssqrt") + (eq_attr "cpu" "ppc405")) + "Xfpu_issue*2,Xfpu_sqrt*10") ;; Unpipelined + + +(define_insn_reservation "fp-sqrt-d" 34 ;; Actually 116 + (and (eq_attr "type" "dsqrt") + (eq_attr "cpu" "ppc405")) + "Xfpu_issue*2,Xfpu_sqrt*10") ;; Unpipelined + Index: gcc-4.3.4-20091019/gcc/config.in =================================================================== --- gcc-4.3.4-20091019.orig/gcc/config.in 2009-10-19 13:39:51.000000000 +0200 +++ gcc-4.3.4-20091019/gcc/config.in 2009-10-19 13:40:37.000000000 +0200 @@ -338,6 +338,12 @@ #endif +/* Define if your assembler supports POPCNTD instructions. 
*/ +#ifndef USED_FOR_TARGET +#undef HAVE_AS_POPCNTD +#endif + + /* Define if your assembler supports .register. */ #ifndef USED_FOR_TARGET #undef HAVE_AS_REGISTER_PSEUDO_OP @@ -375,11 +381,13 @@ #undef HAVE_AS_TLS #endif + /* Define if your assembler supports VSX instructions. */ #ifndef USED_FOR_TARGET #undef HAVE_AS_VSX #endif + /* Define to 1 if you have the `atoll' function. */ #ifndef USED_FOR_TARGET #undef HAVE_ATOLL @@ -733,6 +741,12 @@ #endif +/* Define to 1 if you have the <dlfcn.h> header file. */ +#ifndef USED_FOR_TARGET +#undef HAVE_DLFCN_H +#endif + + /* Define to 1 if you have the <fcntl.h> header file. */ #ifndef USED_FOR_TARGET #undef HAVE_FCNTL_H @@ -1319,6 +1333,13 @@ #endif +/* Define to the sub-directory in which libtool stores uninstalled libraries. + */ +#ifndef USED_FOR_TARGET +#undef LT_OBJDIR +#endif + + /* Define if host mkdir takes a single argument. */ #ifndef USED_FOR_TARGET #undef MKDIR_TAKES_ONE_ARG @@ -1373,37 +1394,37 @@ #endif -/* The size of `int', as computed by sizeof. */ +/* The size of a `int', as computed by sizeof. */ #ifndef USED_FOR_TARGET #undef SIZEOF_INT #endif -/* The size of `long', as computed by sizeof. */ +/* The size of a `long', as computed by sizeof. */ #ifndef USED_FOR_TARGET #undef SIZEOF_LONG #endif -/* The size of `long long', as computed by sizeof. */ +/* The size of a `long long', as computed by sizeof. */ #ifndef USED_FOR_TARGET #undef SIZEOF_LONG_LONG #endif -/* The size of `short', as computed by sizeof. */ +/* The size of a `short', as computed by sizeof. */ #ifndef USED_FOR_TARGET #undef SIZEOF_SHORT #endif -/* The size of `void *', as computed by sizeof. */ +/* The size of a `void *', as computed by sizeof. */ #ifndef USED_FOR_TARGET #undef SIZEOF_VOID_P #endif -/* The size of `__int64', as computed by sizeof. */ +/* The size of a `__int64', as computed by sizeof. 
*/ #ifndef USED_FOR_TARGET #undef SIZEOF___INT64 #endif Index: gcc-4.3.4-20091019/gcc/configure =================================================================== --- gcc-4.3.4-20091019.orig/gcc/configure 2009-10-19 13:39:51.000000000 +0200 +++ gcc-4.3.4-20091019/gcc/configure 2009-10-19 13:40:37.000000000 +0200 @@ -21882,6 +21882,144 @@ _ACEOF fi case $target in + *-*-aix*) conftest_s=' .machine "pwr7" + .csect .text[PR] + lxvd2x 1,2,3';; + *) conftest_s=' .machine power7 + .text + lxvd2x 1,2,3';; + esac + + echo "$as_me:$LINENO: checking assembler for vector-scalar support" >&5 +echo $ECHO_N "checking assembler for vector-scalar support... $ECHO_C" >&6 +if test "${gcc_cv_as_powerpc_vsx+set}" = set; then + echo $ECHO_N "(cached) $ECHO_C" >&6 +else + gcc_cv_as_powerpc_vsx=no + if test $in_tree_gas = yes; then + if test $gcc_cv_gas_vers -ge `expr \( \( 9 \* 1000 \) + 99 \) \* 1000 + 0` + then gcc_cv_as_powerpc_vsx=yes +fi + elif test x$gcc_cv_as != x; then + echo "$conftest_s" > conftest.s + if { ac_try='$gcc_cv_as -a32 -o conftest.o conftest.s >&5' + { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 + (eval $ac_try) 2>&5 + ac_status=$? + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); }; } + then + gcc_cv_as_powerpc_vsx=yes + else + echo "configure: failed program was" >&5 + cat conftest.s >&5 + fi + rm -f conftest.o conftest.s + fi +fi +echo "$as_me:$LINENO: result: $gcc_cv_as_powerpc_vsx" >&5 +echo "${ECHO_T}$gcc_cv_as_powerpc_vsx" >&6 +if test $gcc_cv_as_powerpc_vsx = yes; then + +cat >>confdefs.h <<\_ACEOF +#define HAVE_AS_VSX 1 +_ACEOF + +fi + + case $target in + *-*-aix*) conftest_s=' .machine "pwr7" + .csect .text[PR] + popcntd 3,3';; + *) conftest_s=' .machine power7 + .text + popcntd 3,3';; + esac + + echo "$as_me:$LINENO: checking assembler for popcntd support" >&5 +echo $ECHO_N "checking assembler for popcntd support... 
$ECHO_C" >&6 +if test "${gcc_cv_as_powerpc_popcntd+set}" = set; then + echo $ECHO_N "(cached) $ECHO_C" >&6 +else + gcc_cv_as_powerpc_popcntd=no + if test $in_tree_gas = yes; then + if test $gcc_cv_gas_vers -ge `expr \( \( 9 \* 1000 \) + 99 \) \* 1000 + 0` + then gcc_cv_as_powerpc_popcntd=yes +fi + elif test x$gcc_cv_as != x; then + echo "$conftest_s" > conftest.s + if { ac_try='$gcc_cv_as -a32 -o conftest.o conftest.s >&5' + { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 + (eval $ac_try) 2>&5 + ac_status=$? + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); }; } + then + gcc_cv_as_powerpc_popcntd=yes + else + echo "configure: failed program was" >&5 + cat conftest.s >&5 + fi + rm -f conftest.o conftest.s + fi +fi +echo "$as_me:$LINENO: result: $gcc_cv_as_powerpc_popcntd" >&5 +echo "${ECHO_T}$gcc_cv_as_powerpc_popcntd" >&6 +if test $gcc_cv_as_powerpc_popcntd = yes; then + +cat >>confdefs.h <<\_ACEOF +#define HAVE_AS_POPCNTD 1 +_ACEOF + +fi + + case $target in + *-*-aix*) conftest_s=' .machine "pwr7" + .csect .text[PR] + lxvd2x 1,2,3';; + *) conftest_s=' .machine power7 + .text + lxvd2x 1,2,3';; + esac + + echo "$as_me:$LINENO: checking assembler for vector-scalar support" >&5 +echo $ECHO_N "checking assembler for vector-scalar support... $ECHO_C" >&6 +if test "${gcc_cv_as_powerpc_vsx+set}" = set; then + echo $ECHO_N "(cached) $ECHO_C" >&6 +else + gcc_cv_as_powerpc_vsx=no + if test $in_tree_gas = yes; then + if test $gcc_cv_gas_vers -ge `expr \( \( 9 \* 1000 \) + 99 \) \* 1000 + 0` + then gcc_cv_as_powerpc_vsx=yes +fi + elif test x$gcc_cv_as != x; then + echo "$conftest_s" > conftest.s + if { ac_try='$gcc_cv_as -a32 -o conftest.o conftest.s >&5' + { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 + (eval $ac_try) 2>&5 + ac_status=$? + echo "$as_me:$LINENO: \$? 
= $ac_status" >&5 + (exit $ac_status); }; } + then + gcc_cv_as_powerpc_vsx=yes + else + echo "configure: failed program was" >&5 + cat conftest.s >&5 + fi + rm -f conftest.o conftest.s + fi +fi +echo "$as_me:$LINENO: result: $gcc_cv_as_powerpc_vsx" >&5 +echo "${ECHO_T}$gcc_cv_as_powerpc_vsx" >&6 +if test $gcc_cv_as_powerpc_vsx = yes; then + +cat >>confdefs.h <<\_ACEOF +#define HAVE_AS_VSX 1 +_ACEOF + +fi + + case $target in *-*-aix*) conftest_s=' .machine "pwr7" .csect .text[PR] lxvd2x 1,2,3';; Index: gcc-4.3.4-20091019/gcc/configure.ac =================================================================== --- gcc-4.3.4-20091019.orig/gcc/configure.ac 2009-10-19 13:39:51.000000000 +0200 +++ gcc-4.3.4-20091019/gcc/configure.ac 2009-10-19 13:40:37.000000000 +0200 @@ -2,7 +2,7 @@ # Process this file with autoconf to generate a configuration script. # Copyright 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, -# 2007, 2008 Free Software Foundation, Inc. +# 2007, 2008, 2009 Free Software Foundation, Inc. #This file is part of GCC. 
@@ -3014,6 +3014,51 @@ LCF0: [Define if your assembler supports DFP instructions.])]) case $target in + *-*-aix*) conftest_s=' .machine "pwr7" + .csect .text[[PR]] + lxvd2x 1,2,3';; + *) conftest_s=' .machine power7 + .text + lxvd2x 1,2,3';; + esac + + gcc_GAS_CHECK_FEATURE([vector-scalar support], + gcc_cv_as_powerpc_vsx, [9,99,0], -a32, + [$conftest_s],, + [AC_DEFINE(HAVE_AS_VSX, 1, + [Define if your assembler supports VSX instructions.])]) + + case $target in + *-*-aix*) conftest_s=' .machine "pwr7" + .csect .text[[PR]] + popcntd 3,3';; + *) conftest_s=' .machine power7 + .text + popcntd 3,3';; + esac + + gcc_GAS_CHECK_FEATURE([popcntd support], + gcc_cv_as_powerpc_popcntd, [9,99,0], -a32, + [$conftest_s],, + [AC_DEFINE(HAVE_AS_POPCNTD, 1, + [Define if your assembler supports POPCNTD instructions.])]) + + case $target in + *-*-aix*) conftest_s=' .machine "pwr7" + .csect .text[[PR]] + lxvd2x 1,2,3';; + *) conftest_s=' .machine power7 + .text + lxvd2x 1,2,3';; + esac + + gcc_GAS_CHECK_FEATURE([vector-scalar support], + gcc_cv_as_powerpc_vsx, [9,99,0], -a32, + [$conftest_s],, + [AC_DEFINE(HAVE_AS_VSX, 1, + [Define if your assembler supports VSX instructions.])]) + + case $target in *-*-aix*) conftest_s=' .machine "pwr7" .csect .text[[PR]] lxvd2x 1,2,3';; Index: gcc-4.3.4-20091019/gcc/config/rs6000/rs6000.h =================================================================== --- gcc-4.3.4-20091019.orig/gcc/config/rs6000/rs6000.h 2009-10-19 13:39:52.000000000 +0200 +++ gcc-4.3.4-20091019/gcc/config/rs6000/rs6000.h 2009-10-19 13:40:37.000000000 +0200 @@ -1,6 +1,6 @@ /* Definitions of target machine for GNU compiler, for IBM RS/6000. Copyright (C) 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999, - 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007 + 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009 Free Software Foundation, Inc. Contributed by Richard Kenner (kenner@vlsi1.ultra.nyu.edu) @@ -16,8 +16,13 @@ or FITNESS FOR A PARTICULAR PURPOSE. 
See the GNU General Public License for more details. - You should have received a copy of the GNU General Public License - along with GCC; see the file COPYING3. If not see + Under Section 7 of GPL version 3, you are granted additional + permissions described in the GCC Runtime Library Exception, version + 3.1, as published by the Free Software Foundation. + + You should have received a copy of the GNU General Public License and + a copy of the GCC Runtime Library Exception along with this program; + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see <http://www.gnu.org/licenses/>. */ /* Note that some other tm.h files include this one and then override @@ -72,14 +77,16 @@ #define ASM_CPU_POWER6_SPEC "-mpower4 -maltivec" #endif -#ifdef HAVE_AS_VSX +#ifdef HAVE_AS_POPCNTD #define ASM_CPU_POWER7_SPEC "-mpower7" #else #define ASM_CPU_POWER7_SPEC "-mpower4 -maltivec" #endif -/* Common ASM definitions used by ASM_SPEC among the various targets - for handling -mcpu=xxx switches. */ +/* Common ASM definitions used by ASM_SPEC among the various targets for + handling -mcpu=xxx switches. There is a parallel list in driver-rs6000.c to + provide the default assembler options if the user uses -mcpu=native, so if + you make changes here, make them also there. 
*/ #define ASM_CPU_SPEC \ "%{!mcpu*: \ %{mpower: %{!mpower2: -mpwr}} \ @@ -88,6 +95,7 @@ %{!mpowerpc64*: %{mpowerpc*: -mppc}} \ %{mno-power: %{!mpowerpc*: -mcom}} \ %{!mno-power: %{!mpower*: %(asm_default)}}} \ +%{mcpu=native: %(asm_cpu_native)} \ %{mcpu=common: -mcom} \ %{mcpu=cell: -mcell} \ %{mcpu=power: -mpwr} \ @@ -112,6 +120,8 @@ %{mcpu=405fp: -m405} \ %{mcpu=440: -m440} \ %{mcpu=440fp: -m440} \ +%{mcpu=464: -m440} \ +%{mcpu=464fp: -m440} \ %{mcpu=505: -mppc} \ %{mcpu=601: -m601} \ %{mcpu=602: -mppc} \ @@ -136,6 +146,9 @@ %{mcpu=G5: -mpower4 -maltivec} \ %{mcpu=8540: -me500} \ %{mcpu=8548: -me500} \ +%{mcpu=e300c2: -me300} \ +%{mcpu=e300c3: -me300} \ +%{mcpu=e500mc: -me500mc} \ %{maltivec: -maltivec} \ -many" @@ -158,6 +171,7 @@ #define EXTRA_SPECS \ { "cpp_default", CPP_DEFAULT_SPEC }, \ { "asm_cpu", ASM_CPU_SPEC }, \ + { "asm_cpu_native", ASM_CPU_NATIVE_SPEC }, \ { "asm_default", ASM_DEFAULT_SPEC }, \ { "cc1_cpu", CC1_CPU_SPEC }, \ { "asm_cpu_power5", ASM_CPU_POWER5_SPEC }, \ @@ -174,6 +188,10 @@ extern const char *host_detect_local_cpu #define EXTRA_SPEC_FUNCTIONS \ { "local_cpu_detect", host_detect_local_cpu }, #define HAVE_LOCAL_CPU_DETECT +#define ASM_CPU_NATIVE_SPEC "%:local_cpu_detect(asm)" + +#else +#define ASM_CPU_NATIVE_SPEC "%(asm_default)" #endif #ifndef CC1_CPU_SPEC @@ -235,6 +253,31 @@ extern const char *host_detect_local_cpu #define TARGET_DFP 0 #endif +/* Define TARGET_POPCNTD if the target assembler does not support the + popcount word and double word instructions. */ + +#ifndef HAVE_AS_POPCNTD +#undef TARGET_POPCNTD +#define TARGET_POPCNTD 0 +#endif + +/* Define TARGET_LWSYNC_INSTRUCTION if the assembler knows about lwsync. If + not, generate the lwsync code as an integer constant. */ +#ifdef HAVE_AS_LWSYNC +#define TARGET_LWSYNC_INSTRUCTION 1 +#else +#define TARGET_LWSYNC_INSTRUCTION 0 +#endif + +/* Define TARGET_TLS_MARKERS if the target assembler does not support + arg markers for __tls_get_addr calls. 
*/ +#ifndef HAVE_AS_TLS_MARKERS +#undef TARGET_TLS_MARKERS +#define TARGET_TLS_MARKERS 0 +#else +#define TARGET_TLS_MARKERS tls_markers +#endif + #ifndef TARGET_SECURE_PLT #define TARGET_SECURE_PLT 0 #endif @@ -284,12 +327,25 @@ enum processor_type PROCESSOR_PPC7400, PROCESSOR_PPC7450, PROCESSOR_PPC8540, + PROCESSOR_PPCE300C2, + PROCESSOR_PPCE300C3, + PROCESSOR_PPCE500MC, PROCESSOR_POWER4, PROCESSOR_POWER5, PROCESSOR_POWER6, + PROCESSOR_POWER7, PROCESSOR_CELL }; +/* FPU operations supported. + Each use of TARGET_SINGLE_FLOAT or TARGET_DOUBLE_FLOAT must + also test TARGET_HARD_FLOAT. */ +#define TARGET_SINGLE_FLOAT 1 +#define TARGET_DOUBLE_FLOAT 1 +#define TARGET_SINGLE_FPU 0 +#define TARGET_SIMPLE_FPU 0 +#define TARGET_XILINX_FPU 0 + extern enum processor_type rs6000_cpu; /* Recast the processor type to the cpu attribute. */ @@ -305,6 +361,18 @@ extern enum processor_type rs6000_cpu; #define PROCESSOR_DEFAULT PROCESSOR_RIOS1 #define PROCESSOR_DEFAULT64 PROCESSOR_RS64A +/* FP processor type. */ +enum fpu_type_t +{ + FPU_NONE, /* No FPU */ + FPU_SF_LITE, /* Limited Single Precision FPU */ + FPU_DF_LITE, /* Limited Double Precision FPU */ + FPU_SF_FULL, /* Full Single Precision FPU */ + FPU_DF_FULL /* Full Double Precision FPU */ +}; + +extern enum fpu_type_t fpu_type; + /* Specify the dialect of assembler to use. New mnemonics is dialect one and the old mnemonics are dialect zero. */ #define ASSEMBLER_DIALECT (TARGET_NEW_MNEMONICS ?
1 : 0) @@ -359,9 +427,15 @@ extern struct rs6000_cpu_select rs6000_s extern const char *rs6000_debug_name; /* Name for -mdebug-xxxx option */ extern int rs6000_debug_stack; /* debug stack applications */ extern int rs6000_debug_arg; /* debug argument handling */ +extern int rs6000_debug_reg; /* debug register handling */ +extern int rs6000_debug_addr; /* debug memory addressing */ +extern int rs6000_debug_cost; /* debug rtx_costs */ #define TARGET_DEBUG_STACK rs6000_debug_stack #define TARGET_DEBUG_ARG rs6000_debug_arg +#define TARGET_DEBUG_REG rs6000_debug_reg +#define TARGET_DEBUG_ADDR rs6000_debug_addr +#define TARGET_DEBUG_COST rs6000_debug_cost extern const char *rs6000_traceback_name; /* Type of traceback table. */ @@ -371,10 +445,65 @@ extern int rs6000_long_double_type_size; extern int rs6000_ieeequad; extern int rs6000_altivec_abi; extern int rs6000_spe_abi; +extern int rs6000_spe; extern int rs6000_float_gprs; extern int rs6000_alignment_flags; extern const char *rs6000_sched_insert_nops_str; extern enum rs6000_nop_insertion rs6000_sched_insert_nops; +extern int rs6000_xilinx_fpu; + +/* Describe which vector unit to use for a given machine mode. 
*/ +enum rs6000_vector { + VECTOR_NONE, /* Type is not a vector or not supported */ + VECTOR_ALTIVEC, /* Use altivec for vector processing */ + VECTOR_VSX, /* Use VSX for vector processing */ + VECTOR_PAIRED, /* Use paired floating point for vectors */ + VECTOR_SPE, /* Use SPE for vector processing */ + VECTOR_OTHER /* Some other vector unit */ +}; + +extern enum rs6000_vector rs6000_vector_unit[]; + +#define VECTOR_UNIT_NONE_P(MODE) \ + (rs6000_vector_unit[(MODE)] == VECTOR_NONE) + +#define VECTOR_UNIT_VSX_P(MODE) \ + (rs6000_vector_unit[(MODE)] == VECTOR_VSX) + +#define VECTOR_UNIT_ALTIVEC_P(MODE) \ + (rs6000_vector_unit[(MODE)] == VECTOR_ALTIVEC) + +#define VECTOR_UNIT_ALTIVEC_OR_VSX_P(MODE) \ + (rs6000_vector_unit[(MODE)] == VECTOR_ALTIVEC \ + || rs6000_vector_unit[(MODE)] == VECTOR_VSX) + +/* Describe whether to use VSX loads or Altivec loads. For now, just use the + same unit as the vector unit we are using, but we may want to migrate to + using VSX style loads even for types handled by altivec. */ +extern enum rs6000_vector rs6000_vector_mem[]; + +#define VECTOR_MEM_NONE_P(MODE) \ + (rs6000_vector_mem[(MODE)] == VECTOR_NONE) + +#define VECTOR_MEM_VSX_P(MODE) \ + (rs6000_vector_mem[(MODE)] == VECTOR_VSX) + +#define VECTOR_MEM_ALTIVEC_P(MODE) \ + (rs6000_vector_mem[(MODE)] == VECTOR_ALTIVEC) + +#define VECTOR_MEM_ALTIVEC_OR_VSX_P(MODE) \ + (rs6000_vector_mem[(MODE)] == VECTOR_ALTIVEC \ + || rs6000_vector_mem[(MODE)] == VECTOR_VSX) + +/* Return the alignment of a given vector type, which is set based on the + vector unit use. VSX for instance can load 32 or 64 bit aligned words + without problems, while Altivec requires 128-bit aligned vectors. */ +extern int rs6000_vector_align[]; + +#define VECTOR_ALIGN(MODE) \ + ((rs6000_vector_align[(MODE)] != 0) \ + ? rs6000_vector_align[(MODE)] \ + : (int)GET_MODE_BITSIZE ((MODE))) /* Alignment options for fields in structures for sub-targets following AIX-like ABI. 
@@ -396,11 +525,12 @@ extern enum rs6000_nop_insertion rs6000_ #define TARGET_LONG_DOUBLE_128 (rs6000_long_double_type_size == 128) #define TARGET_IEEEQUAD rs6000_ieeequad #define TARGET_ALTIVEC_ABI rs6000_altivec_abi +#define TARGET_LDBRX (TARGET_POPCNTD || rs6000_cpu == PROCESSOR_CELL) #define TARGET_SPE_ABI 0 #define TARGET_SPE 0 #define TARGET_E500 0 -#define TARGET_ISEL 0 +#define TARGET_ISEL64 (TARGET_ISEL && TARGET_POWERPC64) #define TARGET_FPRS 1 #define TARGET_E500_SINGLE 0 #define TARGET_E500_DOUBLE 0 @@ -498,6 +628,7 @@ extern enum rs6000_nop_insertion rs6000_ #endif #define UNITS_PER_FP_WORD 8 #define UNITS_PER_ALTIVEC_WORD 16 +#define UNITS_PER_VSX_WORD 16 #define UNITS_PER_SPE_WORD 8 #define UNITS_PER_PAIRED_WORD 8 @@ -562,14 +693,16 @@ extern enum rs6000_nop_insertion rs6000_ /* Width in bits of a pointer. See also the macro `Pmode' defined below. */ -#define POINTER_SIZE (TARGET_32BIT ? 32 : 64) +extern unsigned rs6000_pointer_size; +#define POINTER_SIZE rs6000_pointer_size /* Allocation boundary (in *bits*) for storing arguments in argument list. */ #define PARM_BOUNDARY (TARGET_32BIT ? 32 : 64) /* Boundary (in *bits*) on which stack pointer should be aligned. */ -#define STACK_BOUNDARY \ - ((TARGET_32BIT && !TARGET_ALTIVEC && !TARGET_ALTIVEC_ABI) ? 64 : 128) +#define STACK_BOUNDARY \ + ((TARGET_32BIT && !TARGET_ALTIVEC && !TARGET_ALTIVEC_ABI && !TARGET_VSX) \ + ? 64 : 128) /* Allocation boundary (in *bits*) for the code of a function. */ #define FUNCTION_BOUNDARY 32 @@ -581,13 +714,7 @@ extern enum rs6000_nop_insertion rs6000_ local store. TYPE is the data type, and ALIGN is the alignment that the object would ordinarily have. */ #define LOCAL_ALIGNMENT(TYPE, ALIGN) \ - ((TARGET_ALTIVEC && TREE_CODE (TYPE) == VECTOR_TYPE) ? 128 : \ - (TARGET_E500_DOUBLE \ - && (TYPE_MODE (TYPE) == DFmode || TYPE_MODE (TYPE) == DDmode)) ? 
64 : \ - ((TARGET_SPE && TREE_CODE (TYPE) == VECTOR_TYPE \ - && SPE_VECTOR_MODE (TYPE_MODE (TYPE))) || (TARGET_PAIRED_FLOAT \ - && TREE_CODE (TYPE) == VECTOR_TYPE \ - && PAIRED_VECTOR_MODE (TYPE_MODE (TYPE)))) ? 64 : ALIGN) + DATA_ALIGNMENT (TYPE, ALIGN) /* Alignment of field after `int : 0' in a structure. */ #define EMPTY_FIELD_BOUNDARY 32 @@ -609,7 +736,7 @@ extern enum rs6000_nop_insertion rs6000_ fit into 1, whereas DI still needs two. */ #define MEMBER_TYPE_FORCES_BLK(FIELD, MODE) \ ((TARGET_SPE && TREE_CODE (TREE_TYPE (FIELD)) == VECTOR_TYPE) \ - || (TARGET_E500_DOUBLE && ((MODE) == DFmode || (MODE) == DDmode))) + || (TARGET_E500_DOUBLE && (MODE) == DFmode)) /* A bit-field declared as `int' forces `int' alignment for the struct. */ #define PCC_BITFIELD_TYPE_MATTERS 1 @@ -630,7 +757,7 @@ extern enum rs6000_nop_insertion rs6000_ (TREE_CODE (TYPE) == VECTOR_TYPE ? ((TARGET_SPE_ABI \ || TARGET_PAIRED_FLOAT) ? 64 : 128) \ : (TARGET_E500_DOUBLE \ - && (TYPE_MODE (TYPE) == DFmode || TYPE_MODE (TYPE) == DDmode)) ? 64 \ + && TYPE_MODE (TYPE) == DFmode) ? 64 \ : TREE_CODE (TYPE) == ARRAY_TYPE \ && TYPE_MODE (TREE_TYPE (TYPE)) == QImode \ && (ALIGN) < BITS_PER_WORD ? BITS_PER_WORD : (ALIGN)) @@ -642,15 +769,17 @@ extern enum rs6000_nop_insertion rs6000_ /* Define this macro to be the value 1 if unaligned accesses have a cost many times greater than aligned accesses, for example if they are emulated in a trap handler. */ -/* Altivec vector memory instructions simply ignore the low bits; SPE - vector memory instructions trap on unaligned accesses. */ +/* Altivec vector memory instructions simply ignore the low bits; SPE vector + memory instructions trap on unaligned accesses; VSX memory instructions are + aligned to 4 or 8 bytes. 
*/ #define SLOW_UNALIGNED_ACCESS(MODE, ALIGN) \ (STRICT_ALIGNMENT \ || (((MODE) == SFmode || (MODE) == DFmode || (MODE) == TFmode \ || (MODE) == SDmode || (MODE) == DDmode || (MODE) == TDmode \ || (MODE) == DImode) \ && (ALIGN) < 32) \ - || (VECTOR_MODE_P ((MODE)) && (ALIGN) < GET_MODE_BITSIZE ((MODE)))) + || (VECTOR_MODE_P ((MODE)) && (((int)(ALIGN)) < VECTOR_ALIGN (MODE)))) + /* Standard register usage. */ @@ -877,16 +1006,49 @@ extern enum rs6000_nop_insertion rs6000_ /* True if register is an AltiVec register. */ #define ALTIVEC_REGNO_P(N) ((N) >= FIRST_ALTIVEC_REGNO && (N) <= LAST_ALTIVEC_REGNO) +/* True if register is a VSX register. */ +#define VSX_REGNO_P(N) (FP_REGNO_P (N) || ALTIVEC_REGNO_P (N)) + +/* Alternate name for any vector register supporting floating point, no matter + which instruction set(s) are available. */ +#define VFLOAT_REGNO_P(N) \ + (ALTIVEC_REGNO_P (N) || (TARGET_VSX && FP_REGNO_P (N))) + +/* Alternate name for any vector register supporting integer, no matter which + instruction set(s) are available. */ +#define VINT_REGNO_P(N) ALTIVEC_REGNO_P (N) + +/* Alternate name for any vector register supporting logical operations, no + matter which instruction set(s) are available. */ +#define VLOGICAL_REGNO_P(N) VFLOAT_REGNO_P (N) + /* Return number of consecutive hard regs needed starting at reg REGNO to hold something of mode MODE. */ -#define HARD_REGNO_NREGS(REGNO, MODE) rs6000_hard_regno_nregs ((REGNO), (MODE)) +#define HARD_REGNO_NREGS(REGNO, MODE) rs6000_hard_regno_nregs[(MODE)][(REGNO)] #define HARD_REGNO_CALL_PART_CLOBBERED(REGNO, MODE) \ ((TARGET_32BIT && TARGET_POWERPC64 \ && (GET_MODE_SIZE (MODE) > 4) \ && INT_REGNO_P (REGNO)) ? 
1 : 0) +#define VSX_VECTOR_MODE(MODE) \ + ((MODE) == V4SFmode \ + || (MODE) == V2DFmode) \ + +#define VSX_SCALAR_MODE(MODE) \ + ((MODE) == DFmode) + +#define VSX_MODE(MODE) \ + (VSX_VECTOR_MODE (MODE) \ + || VSX_SCALAR_MODE (MODE)) + +#define VSX_MOVE_MODE(MODE) \ + (VSX_VECTOR_MODE (MODE) \ + || VSX_SCALAR_MODE (MODE) \ + || ALTIVEC_VECTOR_MODE (MODE) \ + || (MODE) == TImode) + #define ALTIVEC_VECTOR_MODE(MODE) \ ((MODE) == V16QImode \ || (MODE) == V8HImode \ @@ -902,10 +1064,12 @@ extern enum rs6000_nop_insertion rs6000_ #define PAIRED_VECTOR_MODE(MODE) \ ((MODE) == V2SFmode) -#define UNITS_PER_SIMD_WORD \ - (TARGET_ALTIVEC ? UNITS_PER_ALTIVEC_WORD \ - : (TARGET_SPE ? UNITS_PER_SPE_WORD : (TARGET_PAIRED_FLOAT ? \ - UNITS_PER_PAIRED_WORD : UNITS_PER_WORD))) +#define UNITS_PER_SIMD_WORD \ + (TARGET_VSX ? UNITS_PER_VSX_WORD \ + : (TARGET_ALTIVEC ? UNITS_PER_ALTIVEC_WORD \ + : (TARGET_SPE ? UNITS_PER_SPE_WORD \ + : (TARGET_PAIRED_FLOAT ? UNITS_PER_PAIRED_WORD \ + : UNITS_PER_WORD)))) /* Value is TRUE if hard register REGNO can hold a value of machine-mode MODE. */ @@ -933,6 +1097,10 @@ extern enum rs6000_nop_insertion rs6000_ ? ALTIVEC_VECTOR_MODE (MODE2) \ : ALTIVEC_VECTOR_MODE (MODE2) \ ? ALTIVEC_VECTOR_MODE (MODE1) \ + : VSX_VECTOR_MODE (MODE1) \ + ? VSX_VECTOR_MODE (MODE2) \ + : VSX_VECTOR_MODE (MODE2) \ + ? VSX_VECTOR_MODE (MODE1) \ : 1) /* Post-reload, we can't use any new AltiVec registers, as we already @@ -1024,9 +1192,10 @@ extern enum rs6000_nop_insertion rs6000_ For any two classes, it is very desirable that there be another class that represents their union. */ -/* The RS/6000 has three types of registers, fixed-point, floating-point, - and condition registers, plus three special registers, MQ, CTR, and the - link register. AltiVec adds a vector register class. +/* The RS/6000 has three types of registers, fixed-point, floating-point, and + condition registers, plus three special registers, MQ, CTR, and the link + register. 
AltiVec adds a vector register class. VSX registers overlap the + FPR registers and the Altivec registers. However, r0 is special in that it cannot be used as a base register. So make a class for registers valid as base registers. @@ -1041,6 +1210,7 @@ enum reg_class GENERAL_REGS, FLOAT_REGS, ALTIVEC_REGS, + VSX_REGS, VRSAVE_REGS, VSCR_REGS, SPE_ACC_REGS, @@ -1071,6 +1241,7 @@ enum reg_class "GENERAL_REGS", \ "FLOAT_REGS", \ "ALTIVEC_REGS", \ + "VSX_REGS", \ "VRSAVE_REGS", \ "VSCR_REGS", \ "SPE_ACC_REGS", \ @@ -1100,6 +1271,7 @@ enum reg_class { 0xffffffff, 0x00000000, 0x00000008, 0x00020000 }, /* GENERAL_REGS */ \ { 0x00000000, 0xffffffff, 0x00000000, 0x00000000 }, /* FLOAT_REGS */ \ { 0x00000000, 0x00000000, 0xffffe000, 0x00001fff }, /* ALTIVEC_REGS */ \ + { 0x00000000, 0xffffffff, 0xffffe000, 0x00001fff }, /* VSX_REGS */ \ { 0x00000000, 0x00000000, 0x00000000, 0x00002000 }, /* VRSAVE_REGS */ \ { 0x00000000, 0x00000000, 0x00000000, 0x00004000 }, /* VSCR_REGS */ \ { 0x00000000, 0x00000000, 0x00000000, 0x00008000 }, /* SPE_ACC_REGS */ \ @@ -1123,29 +1295,40 @@ enum reg_class reg number REGNO. This could be a conditional expression or could index an array. */ -#define REGNO_REG_CLASS(REGNO) \ - ((REGNO) == 0 ? GENERAL_REGS \ - : (REGNO) < 32 ? BASE_REGS \ - : FP_REGNO_P (REGNO) ? FLOAT_REGS \ - : ALTIVEC_REGNO_P (REGNO) ? ALTIVEC_REGS \ - : (REGNO) == CR0_REGNO ? CR0_REGS \ - : CR_REGNO_P (REGNO) ? CR_REGS \ - : (REGNO) == MQ_REGNO ? MQ_REGS \ - : (REGNO) == LR_REGNO ? LINK_REGS \ - : (REGNO) == CTR_REGNO ? CTR_REGS \ - : (REGNO) == ARG_POINTER_REGNUM ? BASE_REGS \ - : (REGNO) == XER_REGNO ? XER_REGS \ - : (REGNO) == VRSAVE_REGNO ? VRSAVE_REGS \ - : (REGNO) == VSCR_REGNO ? VRSAVE_REGS \ - : (REGNO) == SPE_ACC_REGNO ? SPE_ACC_REGS \ - : (REGNO) == SPEFSCR_REGNO ? SPEFSCR_REGS \ - : (REGNO) == FRAME_POINTER_REGNUM ? 
BASE_REGS \ - : NO_REGS) +extern enum reg_class rs6000_regno_regclass[FIRST_PSEUDO_REGISTER]; + +#if ENABLE_CHECKING +#define REGNO_REG_CLASS(REGNO) \ + (gcc_assert (IN_RANGE ((REGNO), 0, FIRST_PSEUDO_REGISTER-1)), \ + rs6000_regno_regclass[(REGNO)]) + +#else +#define REGNO_REG_CLASS(REGNO) rs6000_regno_regclass[(REGNO)] +#endif + +/* Register classes for various constraints that are based on the target + switches. */ +enum r6000_reg_class_enum { + RS6000_CONSTRAINT_d, /* fpr registers for double values */ + RS6000_CONSTRAINT_f, /* fpr registers for single values */ + RS6000_CONSTRAINT_v, /* Altivec registers */ + RS6000_CONSTRAINT_wa, /* Any VSX register */ + RS6000_CONSTRAINT_wd, /* VSX register for V2DF */ + RS6000_CONSTRAINT_wf, /* VSX register for V4SF */ + RS6000_CONSTRAINT_ws, /* VSX register for DF */ + RS6000_CONSTRAINT_MAX +}; + +extern enum reg_class rs6000_constraints[RS6000_CONSTRAINT_MAX]; /* The class value for index registers, and the one for base regs. */ #define INDEX_REG_CLASS GENERAL_REGS #define BASE_REG_CLASS BASE_REGS +/* Return whether a given register class can hold VSX objects. */ +#define VSX_REG_CLASS_P(CLASS) \ + ((CLASS) == VSX_REGS || (CLASS) == FLOAT_REGS || (CLASS) == ALTIVEC_REGS) + /* Given an rtx X being reloaded into a reg required to be in class CLASS, return the class of reg to actually use. In general this is just CLASS; but on some machines @@ -1165,20 +1348,14 @@ enum reg_class */ #define PREFERRED_RELOAD_CLASS(X,CLASS) \ - ((CONSTANT_P (X) \ - && reg_classes_intersect_p ((CLASS), FLOAT_REGS)) \ - ? NO_REGS \ - : (GET_MODE_CLASS (GET_MODE (X)) == MODE_INT \ - && (CLASS) == NON_SPECIAL_REGS) \ - ? GENERAL_REGS \ - : (CLASS)) + rs6000_preferred_reload_class_ptr (X, CLASS) /* Return the register class of a scratch register needed to copy IN into or out of a register in CLASS in MODE. If it can be done directly, NO_REGS is returned. 
*/ #define SECONDARY_RELOAD_CLASS(CLASS,MODE,IN) \ - rs6000_secondary_reload_class (CLASS, MODE, IN) + rs6000_secondary_reload_class_ptr (CLASS, MODE, IN) /* If we are copying between FP or AltiVec registers and anything else, we need a memory location. The exception is when we are @@ -1186,18 +1363,7 @@ enum reg_class are available.*/ #define SECONDARY_MEMORY_NEEDED(CLASS1,CLASS2,MODE) \ - ((CLASS1) != (CLASS2) && (((CLASS1) == FLOAT_REGS \ - && (!TARGET_MFPGPR || !TARGET_POWERPC64 \ - || ((MODE != DFmode) \ - && (MODE != DDmode) \ - && (MODE != DImode)))) \ - || ((CLASS2) == FLOAT_REGS \ - && (!TARGET_MFPGPR || !TARGET_POWERPC64 \ - || ((MODE != DFmode) \ - && (MODE != DDmode) \ - && (MODE != DImode)))) \ - || (CLASS1) == ALTIVEC_REGS \ - || (CLASS2) == ALTIVEC_REGS)) + rs6000_secondary_memory_needed_ptr (CLASS1, CLASS2, MODE) /* For cpus that cannot load/store SDmode values from the 64-bit FP registers without using a full 64-bit load/store, we need @@ -1209,32 +1375,15 @@ enum reg_class /* Return the maximum number of consecutive registers needed to represent mode MODE in a register of class CLASS. - On RS/6000, this is the size of MODE in words, - except in the FP regs, where a single reg is enough for two words. */ -#define CLASS_MAX_NREGS(CLASS, MODE) \ - (((CLASS) == FLOAT_REGS) \ - ? ((GET_MODE_SIZE (MODE) + UNITS_PER_FP_WORD - 1) / UNITS_PER_FP_WORD) \ - : (TARGET_E500_DOUBLE && (CLASS) == GENERAL_REGS \ - && ((MODE) == DFmode || (MODE) == DDmode)) \ - ? 1 \ - : ((GET_MODE_SIZE (MODE) + UNITS_PER_WORD - 1) / UNITS_PER_WORD)) + On RS/6000, this is the size of MODE in words, except in the FP regs, where + a single reg is enough for two words, unless we have VSX, where the FP + registers can hold 128 bits. */ +#define CLASS_MAX_NREGS(CLASS, MODE) rs6000_class_max_nregs[(MODE)][(CLASS)] /* Return nonzero if for CLASS a mode change from FROM to TO is invalid. 
*/
 #define CANNOT_CHANGE_MODE_CLASS(FROM, TO, CLASS) \
-  (GET_MODE_SIZE (FROM) != GET_MODE_SIZE (TO) \
-   ? ((GET_MODE_SIZE (FROM) < 8 || GET_MODE_SIZE (TO) < 8 \
-       || TARGET_IEEEQUAD) \
-      && reg_classes_intersect_p (FLOAT_REGS, CLASS)) \
-   : (((TARGET_E500_DOUBLE \
-        && ((((TO) == DFmode) + ((FROM) == DFmode)) == 1 \
-            || (((TO) == TFmode) + ((FROM) == TFmode)) == 1 \
-            || (((TO) == DDmode) + ((FROM) == DDmode)) == 1 \
-            || (((TO) == TDmode) + ((FROM) == TDmode)) == 1 \
-            || (((TO) == DImode) + ((FROM) == DImode)) == 1)) \
-       || (TARGET_SPE \
-           && (SPE_VECTOR_MODE (FROM) + SPE_VECTOR_MODE (TO)) == 1)) \
-      && reg_classes_intersect_p (GENERAL_REGS, CLASS)))
+  rs6000_cannot_change_mode_class_ptr (FROM, TO, CLASS)

 /* Stack layout; function entry, exit and calling. */

@@ -1296,7 +1445,7 @@ extern enum rs6000_abi rs6000_current_ab
   (FRAME_GROWS_DOWNWARD \
    ? 0 \
    : (RS6000_ALIGN (current_function_outgoing_args_size, \
-                    TARGET_ALTIVEC ? 16 : 8) \
+                    (TARGET_ALTIVEC || TARGET_VSX) ? 16 : 8) \
       + RS6000_SAVE_AREA))

 /* Offset from the stack pointer register to an item dynamically
@@ -1307,7 +1456,7 @@ extern enum rs6000_abi rs6000_current_ab
    machines. See `function.c' for details. */
 #define STACK_DYNAMIC_OFFSET(FUNDECL) \
   (RS6000_ALIGN (current_function_outgoing_args_size, \
-                 TARGET_ALTIVEC ? 16 : 8) \
+                 (TARGET_ALTIVEC || TARGET_VSX) ? 16 : 8) \
    + (STACK_POINTER_OFFSET))

 /* If we generate an insn to push BYTES bytes,
@@ -1709,6 +1858,10 @@ typedef struct rs6000_args
    && EASY_VECTOR_15((n) >> 1) \
    && ((n) & 1) == 0)

+#define EASY_VECTOR_MSB(n,mode) \
+  (((unsigned HOST_WIDE_INT)n) == \
+   ((((unsigned HOST_WIDE_INT)GET_MODE_MASK (mode)) + 1) >> 1))
+
 /* The macros REG_OK_FOR..._P assume that the arg is a REG rtx
    and check its validity for a certain class.
    We have two alternate definitions for each of them.
@@ -1761,9 +1914,9 @@ typedef struct rs6000_args
    adjacent memory cells are accessed by adding word-sized offsets
    during assembly output. */

-#define GO_IF_LEGITIMATE_ADDRESS(MODE, X, ADDR) \
-{ if (rs6000_legitimate_address (MODE, X, REG_OK_STRICT_FLAG)) \
-    goto ADDR; \
+#define GO_IF_LEGITIMATE_ADDRESS(MODE, X, ADDR) \
+{ if (rs6000_legitimate_address_ptr (MODE, X, REG_OK_STRICT_FLAG)) \
+    goto ADDR; \
 }

 /* Try machine-dependent ways of modifying an illegitimate address
@@ -1790,8 +1943,8 @@ typedef struct rs6000_args
    load the other things into a register and return the sum. */

 #define LEGITIMIZE_ADDRESS(X,OLDX,MODE,WIN) \
-{ rtx result = rs6000_legitimize_address (X, OLDX, MODE); \
-  if (result != NULL_RTX) \
+{ rtx result = rs6000_legitimize_address_ptr (X, OLDX, MODE); \
+  if (result != NULL_RTX && result != (X)) \
     { \
       (X) = result; \
       goto WIN; \
@@ -1808,7 +1961,7 @@ typedef struct rs6000_args
 #define LEGITIMIZE_RELOAD_ADDRESS(X,MODE,OPNUM,TYPE,IND_LEVELS,WIN) \
 do { \
   int win; \
-  (X) = rs6000_legitimize_reload_address ((X), (MODE), (OPNUM), \
+  (X) = rs6000_legitimize_reload_address_ptr ((X), (MODE), (OPNUM), \
                         (int)(TYPE), (IND_LEVELS), &win); \
   if ( win ) \
     goto WIN; \
@@ -1819,9 +1972,11 @@ do { \

 #define GO_IF_MODE_DEPENDENT_ADDRESS(ADDR,LABEL) \
 do { \
-  if (rs6000_mode_dependent_address (ADDR)) \
+  if (rs6000_mode_dependent_address_ptr (ADDR)) \
     goto LABEL; \
 } while (0)
+
+#define FIND_BASE_TERM rs6000_find_base_term

 /* The register number of the register used to address a table of
    static data addresses in memory. In some cases this register is
@@ -1919,7 +2074,8 @@ do { \
 /* Specify the machine mode that pointers have.
    After generation of rtl, the compiler makes no further distinction
    between pointers and any other objects of this machine mode. */
-#define Pmode (TARGET_32BIT ? SImode : DImode)
+extern unsigned rs6000_pmode;
+#define Pmode ((enum machine_mode)rs6000_pmode)

 /* Supply definition of STACK_SIZE_MODE for allocate_dynamic_stack_space. */
 #define STACK_SIZE_MODE (TARGET_32BIT ? SImode : DImode)
@@ -2266,7 +2422,24 @@ extern char rs6000_reg_names[][8]; /* re
   /* no additional names for: mq, lr, ctr, ap */ \
   {"cr0", 68}, {"cr1", 69}, {"cr2", 70}, {"cr3", 71}, \
   {"cr4", 72}, {"cr5", 73}, {"cr6", 74}, {"cr7", 75}, \
-  {"cc", 68}, {"sp", 1}, {"toc", 2} }
+  {"cc", 68}, {"sp", 1}, {"toc", 2}, \
+  /* VSX registers overlaid on top of FR, Altivec registers */ \
+  {"vs0", 32}, {"vs1", 33}, {"vs2", 34}, {"vs3", 35}, \
+  {"vs4", 36}, {"vs5", 37}, {"vs6", 38}, {"vs7", 39}, \
+  {"vs8", 40}, {"vs9", 41}, {"vs10", 42}, {"vs11", 43}, \
+  {"vs12", 44}, {"vs13", 45}, {"vs14", 46}, {"vs15", 47}, \
+  {"vs16", 48}, {"vs17", 49}, {"vs18", 50}, {"vs19", 51}, \
+  {"vs20", 52}, {"vs21", 53}, {"vs22", 54}, {"vs23", 55}, \
+  {"vs24", 56}, {"vs25", 57}, {"vs26", 58}, {"vs27", 59}, \
+  {"vs28", 60}, {"vs29", 61}, {"vs30", 62}, {"vs31", 63}, \
+  {"vs32", 77}, {"vs33", 78}, {"vs34", 79}, {"vs35", 80}, \
+  {"vs36", 81}, {"vs37", 82}, {"vs38", 83}, {"vs39", 84}, \
+  {"vs40", 85}, {"vs41", 86}, {"vs42", 87}, {"vs43", 88}, \
+  {"vs44", 89}, {"vs45", 90}, {"vs46", 91}, {"vs47", 92}, \
+  {"vs48", 93}, {"vs49", 94}, {"vs50", 95}, {"vs51", 96}, \
+  {"vs52", 97}, {"vs53", 98}, {"vs54", 99}, {"vs55", 100}, \
+  {"vs56", 101},{"vs57", 102},{"vs58", 103},{"vs59", 104}, \
+  {"vs60", 105},{"vs61", 106},{"vs62", 107},{"vs63", 108} }

 /* Text to write out after a CALL that may be replaced by glue code by
    the loader. This depends on the AIX version. */
@@ -2319,6 +2492,12 @@ extern char rs6000_reg_names[][8]; /* re

 #define PRINT_OPERAND_ADDRESS(FILE, ADDR) print_operand_address (FILE, ADDR)

+#define OUTPUT_ADDR_CONST_EXTRA(STREAM, X, FAIL) \
+  do \
+    if (!rs6000_output_addr_const_extra (STREAM, X)) \
+      goto FAIL; \
+  while (0)
+
 /* uncomment for disabling the corresponding default options */
 /* #define MACHINE_no_sched_interblock */
 /* #define MACHINE_no_sched_speculative */
@@ -2330,733 +2509,31 @@ extern int optimize;
 extern int flag_expensive_optimizations;
 extern int frame_pointer_needed;

+/* Classification of the builtin functions to properly set the declaration tree
+   flags. */
+enum rs6000_btc
+{
+  RS6000_BTC_MISC, /* assume builtin can do anything */
+  RS6000_BTC_CONST, /* builtin is a 'const' function. */
+  RS6000_BTC_PURE, /* builtin is a 'pure' function. */
+  RS6000_BTC_FP_PURE /* builtin is 'pure' if rounding math. */
+};
+
+#undef RS6000_BUILTIN
+#undef RS6000_BUILTIN_EQUATE
+#define RS6000_BUILTIN(NAME, TYPE) NAME,
+#define RS6000_BUILTIN_EQUATE(NAME, VALUE) NAME = VALUE,
+
 enum rs6000_builtins
 {
-  /* AltiVec builtins.
*/ - ALTIVEC_BUILTIN_ST_INTERNAL_4si, - ALTIVEC_BUILTIN_LD_INTERNAL_4si, - ALTIVEC_BUILTIN_ST_INTERNAL_8hi, - ALTIVEC_BUILTIN_LD_INTERNAL_8hi, - ALTIVEC_BUILTIN_ST_INTERNAL_16qi, - ALTIVEC_BUILTIN_LD_INTERNAL_16qi, - ALTIVEC_BUILTIN_ST_INTERNAL_4sf, - ALTIVEC_BUILTIN_LD_INTERNAL_4sf, - ALTIVEC_BUILTIN_VADDUBM, - ALTIVEC_BUILTIN_VADDUHM, - ALTIVEC_BUILTIN_VADDUWM, - ALTIVEC_BUILTIN_VADDFP, - ALTIVEC_BUILTIN_VADDCUW, - ALTIVEC_BUILTIN_VADDUBS, - ALTIVEC_BUILTIN_VADDSBS, - ALTIVEC_BUILTIN_VADDUHS, - ALTIVEC_BUILTIN_VADDSHS, - ALTIVEC_BUILTIN_VADDUWS, - ALTIVEC_BUILTIN_VADDSWS, - ALTIVEC_BUILTIN_VAND, - ALTIVEC_BUILTIN_VANDC, - ALTIVEC_BUILTIN_VAVGUB, - ALTIVEC_BUILTIN_VAVGSB, - ALTIVEC_BUILTIN_VAVGUH, - ALTIVEC_BUILTIN_VAVGSH, - ALTIVEC_BUILTIN_VAVGUW, - ALTIVEC_BUILTIN_VAVGSW, - ALTIVEC_BUILTIN_VCFUX, - ALTIVEC_BUILTIN_VCFSX, - ALTIVEC_BUILTIN_VCTSXS, - ALTIVEC_BUILTIN_VCTUXS, - ALTIVEC_BUILTIN_VCMPBFP, - ALTIVEC_BUILTIN_VCMPEQUB, - ALTIVEC_BUILTIN_VCMPEQUH, - ALTIVEC_BUILTIN_VCMPEQUW, - ALTIVEC_BUILTIN_VCMPEQFP, - ALTIVEC_BUILTIN_VCMPGEFP, - ALTIVEC_BUILTIN_VCMPGTUB, - ALTIVEC_BUILTIN_VCMPGTSB, - ALTIVEC_BUILTIN_VCMPGTUH, - ALTIVEC_BUILTIN_VCMPGTSH, - ALTIVEC_BUILTIN_VCMPGTUW, - ALTIVEC_BUILTIN_VCMPGTSW, - ALTIVEC_BUILTIN_VCMPGTFP, - ALTIVEC_BUILTIN_VEXPTEFP, - ALTIVEC_BUILTIN_VLOGEFP, - ALTIVEC_BUILTIN_VMADDFP, - ALTIVEC_BUILTIN_VMAXUB, - ALTIVEC_BUILTIN_VMAXSB, - ALTIVEC_BUILTIN_VMAXUH, - ALTIVEC_BUILTIN_VMAXSH, - ALTIVEC_BUILTIN_VMAXUW, - ALTIVEC_BUILTIN_VMAXSW, - ALTIVEC_BUILTIN_VMAXFP, - ALTIVEC_BUILTIN_VMHADDSHS, - ALTIVEC_BUILTIN_VMHRADDSHS, - ALTIVEC_BUILTIN_VMLADDUHM, - ALTIVEC_BUILTIN_VMRGHB, - ALTIVEC_BUILTIN_VMRGHH, - ALTIVEC_BUILTIN_VMRGHW, - ALTIVEC_BUILTIN_VMRGLB, - ALTIVEC_BUILTIN_VMRGLH, - ALTIVEC_BUILTIN_VMRGLW, - ALTIVEC_BUILTIN_VMSUMUBM, - ALTIVEC_BUILTIN_VMSUMMBM, - ALTIVEC_BUILTIN_VMSUMUHM, - ALTIVEC_BUILTIN_VMSUMSHM, - ALTIVEC_BUILTIN_VMSUMUHS, - ALTIVEC_BUILTIN_VMSUMSHS, - ALTIVEC_BUILTIN_VMINUB, - ALTIVEC_BUILTIN_VMINSB, - 
ALTIVEC_BUILTIN_VMINUH, - ALTIVEC_BUILTIN_VMINSH, - ALTIVEC_BUILTIN_VMINUW, - ALTIVEC_BUILTIN_VMINSW, - ALTIVEC_BUILTIN_VMINFP, - ALTIVEC_BUILTIN_VMULEUB, - ALTIVEC_BUILTIN_VMULESB, - ALTIVEC_BUILTIN_VMULEUH, - ALTIVEC_BUILTIN_VMULESH, - ALTIVEC_BUILTIN_VMULOUB, - ALTIVEC_BUILTIN_VMULOSB, - ALTIVEC_BUILTIN_VMULOUH, - ALTIVEC_BUILTIN_VMULOSH, - ALTIVEC_BUILTIN_VNMSUBFP, - ALTIVEC_BUILTIN_VNOR, - ALTIVEC_BUILTIN_VOR, - ALTIVEC_BUILTIN_VSEL_4SI, - ALTIVEC_BUILTIN_VSEL_4SF, - ALTIVEC_BUILTIN_VSEL_8HI, - ALTIVEC_BUILTIN_VSEL_16QI, - ALTIVEC_BUILTIN_VPERM_4SI, - ALTIVEC_BUILTIN_VPERM_4SF, - ALTIVEC_BUILTIN_VPERM_8HI, - ALTIVEC_BUILTIN_VPERM_16QI, - ALTIVEC_BUILTIN_VPKUHUM, - ALTIVEC_BUILTIN_VPKUWUM, - ALTIVEC_BUILTIN_VPKPX, - ALTIVEC_BUILTIN_VPKUHSS, - ALTIVEC_BUILTIN_VPKSHSS, - ALTIVEC_BUILTIN_VPKUWSS, - ALTIVEC_BUILTIN_VPKSWSS, - ALTIVEC_BUILTIN_VPKUHUS, - ALTIVEC_BUILTIN_VPKSHUS, - ALTIVEC_BUILTIN_VPKUWUS, - ALTIVEC_BUILTIN_VPKSWUS, - ALTIVEC_BUILTIN_VREFP, - ALTIVEC_BUILTIN_VRFIM, - ALTIVEC_BUILTIN_VRFIN, - ALTIVEC_BUILTIN_VRFIP, - ALTIVEC_BUILTIN_VRFIZ, - ALTIVEC_BUILTIN_VRLB, - ALTIVEC_BUILTIN_VRLH, - ALTIVEC_BUILTIN_VRLW, - ALTIVEC_BUILTIN_VRSQRTEFP, - ALTIVEC_BUILTIN_VSLB, - ALTIVEC_BUILTIN_VSLH, - ALTIVEC_BUILTIN_VSLW, - ALTIVEC_BUILTIN_VSL, - ALTIVEC_BUILTIN_VSLO, - ALTIVEC_BUILTIN_VSPLTB, - ALTIVEC_BUILTIN_VSPLTH, - ALTIVEC_BUILTIN_VSPLTW, - ALTIVEC_BUILTIN_VSPLTISB, - ALTIVEC_BUILTIN_VSPLTISH, - ALTIVEC_BUILTIN_VSPLTISW, - ALTIVEC_BUILTIN_VSRB, - ALTIVEC_BUILTIN_VSRH, - ALTIVEC_BUILTIN_VSRW, - ALTIVEC_BUILTIN_VSRAB, - ALTIVEC_BUILTIN_VSRAH, - ALTIVEC_BUILTIN_VSRAW, - ALTIVEC_BUILTIN_VSR, - ALTIVEC_BUILTIN_VSRO, - ALTIVEC_BUILTIN_VSUBUBM, - ALTIVEC_BUILTIN_VSUBUHM, - ALTIVEC_BUILTIN_VSUBUWM, - ALTIVEC_BUILTIN_VSUBFP, - ALTIVEC_BUILTIN_VSUBCUW, - ALTIVEC_BUILTIN_VSUBUBS, - ALTIVEC_BUILTIN_VSUBSBS, - ALTIVEC_BUILTIN_VSUBUHS, - ALTIVEC_BUILTIN_VSUBSHS, - ALTIVEC_BUILTIN_VSUBUWS, - ALTIVEC_BUILTIN_VSUBSWS, - ALTIVEC_BUILTIN_VSUM4UBS, - ALTIVEC_BUILTIN_VSUM4SBS, - 
ALTIVEC_BUILTIN_VSUM4SHS, - ALTIVEC_BUILTIN_VSUM2SWS, - ALTIVEC_BUILTIN_VSUMSWS, - ALTIVEC_BUILTIN_VXOR, - ALTIVEC_BUILTIN_VSLDOI_16QI, - ALTIVEC_BUILTIN_VSLDOI_8HI, - ALTIVEC_BUILTIN_VSLDOI_4SI, - ALTIVEC_BUILTIN_VSLDOI_4SF, - ALTIVEC_BUILTIN_VUPKHSB, - ALTIVEC_BUILTIN_VUPKHPX, - ALTIVEC_BUILTIN_VUPKHSH, - ALTIVEC_BUILTIN_VUPKLSB, - ALTIVEC_BUILTIN_VUPKLPX, - ALTIVEC_BUILTIN_VUPKLSH, - ALTIVEC_BUILTIN_MTVSCR, - ALTIVEC_BUILTIN_MFVSCR, - ALTIVEC_BUILTIN_DSSALL, - ALTIVEC_BUILTIN_DSS, - ALTIVEC_BUILTIN_LVSL, - ALTIVEC_BUILTIN_LVSR, - ALTIVEC_BUILTIN_DSTT, - ALTIVEC_BUILTIN_DSTST, - ALTIVEC_BUILTIN_DSTSTT, - ALTIVEC_BUILTIN_DST, - ALTIVEC_BUILTIN_LVEBX, - ALTIVEC_BUILTIN_LVEHX, - ALTIVEC_BUILTIN_LVEWX, - ALTIVEC_BUILTIN_LVXL, - ALTIVEC_BUILTIN_LVX, - ALTIVEC_BUILTIN_STVX, - ALTIVEC_BUILTIN_LVLX, - ALTIVEC_BUILTIN_LVLXL, - ALTIVEC_BUILTIN_LVRX, - ALTIVEC_BUILTIN_LVRXL, - ALTIVEC_BUILTIN_STVEBX, - ALTIVEC_BUILTIN_STVEHX, - ALTIVEC_BUILTIN_STVEWX, - ALTIVEC_BUILTIN_STVXL, - ALTIVEC_BUILTIN_STVLX, - ALTIVEC_BUILTIN_STVLXL, - ALTIVEC_BUILTIN_STVRX, - ALTIVEC_BUILTIN_STVRXL, - ALTIVEC_BUILTIN_VCMPBFP_P, - ALTIVEC_BUILTIN_VCMPEQFP_P, - ALTIVEC_BUILTIN_VCMPEQUB_P, - ALTIVEC_BUILTIN_VCMPEQUH_P, - ALTIVEC_BUILTIN_VCMPEQUW_P, - ALTIVEC_BUILTIN_VCMPGEFP_P, - ALTIVEC_BUILTIN_VCMPGTFP_P, - ALTIVEC_BUILTIN_VCMPGTSB_P, - ALTIVEC_BUILTIN_VCMPGTSH_P, - ALTIVEC_BUILTIN_VCMPGTSW_P, - ALTIVEC_BUILTIN_VCMPGTUB_P, - ALTIVEC_BUILTIN_VCMPGTUH_P, - ALTIVEC_BUILTIN_VCMPGTUW_P, - ALTIVEC_BUILTIN_ABSS_V4SI, - ALTIVEC_BUILTIN_ABSS_V8HI, - ALTIVEC_BUILTIN_ABSS_V16QI, - ALTIVEC_BUILTIN_ABS_V4SI, - ALTIVEC_BUILTIN_ABS_V4SF, - ALTIVEC_BUILTIN_ABS_V8HI, - ALTIVEC_BUILTIN_ABS_V16QI, - ALTIVEC_BUILTIN_MASK_FOR_LOAD, - ALTIVEC_BUILTIN_MASK_FOR_STORE, - ALTIVEC_BUILTIN_VEC_INIT_V4SI, - ALTIVEC_BUILTIN_VEC_INIT_V8HI, - ALTIVEC_BUILTIN_VEC_INIT_V16QI, - ALTIVEC_BUILTIN_VEC_INIT_V4SF, - ALTIVEC_BUILTIN_VEC_SET_V4SI, - ALTIVEC_BUILTIN_VEC_SET_V8HI, - ALTIVEC_BUILTIN_VEC_SET_V16QI, - 
ALTIVEC_BUILTIN_VEC_SET_V4SF, - ALTIVEC_BUILTIN_VEC_EXT_V4SI, - ALTIVEC_BUILTIN_VEC_EXT_V8HI, - ALTIVEC_BUILTIN_VEC_EXT_V16QI, - ALTIVEC_BUILTIN_VEC_EXT_V4SF, - - /* Altivec overloaded builtins. */ - ALTIVEC_BUILTIN_VCMPEQ_P, - ALTIVEC_BUILTIN_OVERLOADED_FIRST = ALTIVEC_BUILTIN_VCMPEQ_P, - ALTIVEC_BUILTIN_VCMPGT_P, - ALTIVEC_BUILTIN_VCMPGE_P, - ALTIVEC_BUILTIN_VEC_ABS, - ALTIVEC_BUILTIN_VEC_ABSS, - ALTIVEC_BUILTIN_VEC_ADD, - ALTIVEC_BUILTIN_VEC_ADDC, - ALTIVEC_BUILTIN_VEC_ADDS, - ALTIVEC_BUILTIN_VEC_AND, - ALTIVEC_BUILTIN_VEC_ANDC, - ALTIVEC_BUILTIN_VEC_AVG, - ALTIVEC_BUILTIN_VEC_EXTRACT, - ALTIVEC_BUILTIN_VEC_CEIL, - ALTIVEC_BUILTIN_VEC_CMPB, - ALTIVEC_BUILTIN_VEC_CMPEQ, - ALTIVEC_BUILTIN_VEC_CMPEQUB, - ALTIVEC_BUILTIN_VEC_CMPEQUH, - ALTIVEC_BUILTIN_VEC_CMPEQUW, - ALTIVEC_BUILTIN_VEC_CMPGE, - ALTIVEC_BUILTIN_VEC_CMPGT, - ALTIVEC_BUILTIN_VEC_CMPLE, - ALTIVEC_BUILTIN_VEC_CMPLT, - ALTIVEC_BUILTIN_VEC_CTF, - ALTIVEC_BUILTIN_VEC_CTS, - ALTIVEC_BUILTIN_VEC_CTU, - ALTIVEC_BUILTIN_VEC_DST, - ALTIVEC_BUILTIN_VEC_DSTST, - ALTIVEC_BUILTIN_VEC_DSTSTT, - ALTIVEC_BUILTIN_VEC_DSTT, - ALTIVEC_BUILTIN_VEC_EXPTE, - ALTIVEC_BUILTIN_VEC_FLOOR, - ALTIVEC_BUILTIN_VEC_LD, - ALTIVEC_BUILTIN_VEC_LDE, - ALTIVEC_BUILTIN_VEC_LDL, - ALTIVEC_BUILTIN_VEC_LOGE, - ALTIVEC_BUILTIN_VEC_LVEBX, - ALTIVEC_BUILTIN_VEC_LVEHX, - ALTIVEC_BUILTIN_VEC_LVEWX, - ALTIVEC_BUILTIN_VEC_LVLX, - ALTIVEC_BUILTIN_VEC_LVLXL, - ALTIVEC_BUILTIN_VEC_LVRX, - ALTIVEC_BUILTIN_VEC_LVRXL, - ALTIVEC_BUILTIN_VEC_LVSL, - ALTIVEC_BUILTIN_VEC_LVSR, - ALTIVEC_BUILTIN_VEC_MADD, - ALTIVEC_BUILTIN_VEC_MADDS, - ALTIVEC_BUILTIN_VEC_MAX, - ALTIVEC_BUILTIN_VEC_MERGEH, - ALTIVEC_BUILTIN_VEC_MERGEL, - ALTIVEC_BUILTIN_VEC_MIN, - ALTIVEC_BUILTIN_VEC_MLADD, - ALTIVEC_BUILTIN_VEC_MPERM, - ALTIVEC_BUILTIN_VEC_MRADDS, - ALTIVEC_BUILTIN_VEC_MRGHB, - ALTIVEC_BUILTIN_VEC_MRGHH, - ALTIVEC_BUILTIN_VEC_MRGHW, - ALTIVEC_BUILTIN_VEC_MRGLB, - ALTIVEC_BUILTIN_VEC_MRGLH, - ALTIVEC_BUILTIN_VEC_MRGLW, - ALTIVEC_BUILTIN_VEC_MSUM, - ALTIVEC_BUILTIN_VEC_MSUMS, - 
ALTIVEC_BUILTIN_VEC_MTVSCR, - ALTIVEC_BUILTIN_VEC_MULE, - ALTIVEC_BUILTIN_VEC_MULO, - ALTIVEC_BUILTIN_VEC_NMSUB, - ALTIVEC_BUILTIN_VEC_NOR, - ALTIVEC_BUILTIN_VEC_OR, - ALTIVEC_BUILTIN_VEC_PACK, - ALTIVEC_BUILTIN_VEC_PACKPX, - ALTIVEC_BUILTIN_VEC_PACKS, - ALTIVEC_BUILTIN_VEC_PACKSU, - ALTIVEC_BUILTIN_VEC_PERM, - ALTIVEC_BUILTIN_VEC_RE, - ALTIVEC_BUILTIN_VEC_RL, - ALTIVEC_BUILTIN_VEC_ROUND, - ALTIVEC_BUILTIN_VEC_RSQRTE, - ALTIVEC_BUILTIN_VEC_SEL, - ALTIVEC_BUILTIN_VEC_SL, - ALTIVEC_BUILTIN_VEC_SLD, - ALTIVEC_BUILTIN_VEC_SLL, - ALTIVEC_BUILTIN_VEC_SLO, - ALTIVEC_BUILTIN_VEC_SPLAT, - ALTIVEC_BUILTIN_VEC_SPLAT_S16, - ALTIVEC_BUILTIN_VEC_SPLAT_S32, - ALTIVEC_BUILTIN_VEC_SPLAT_S8, - ALTIVEC_BUILTIN_VEC_SPLAT_U16, - ALTIVEC_BUILTIN_VEC_SPLAT_U32, - ALTIVEC_BUILTIN_VEC_SPLAT_U8, - ALTIVEC_BUILTIN_VEC_SPLTB, - ALTIVEC_BUILTIN_VEC_SPLTH, - ALTIVEC_BUILTIN_VEC_SPLTW, - ALTIVEC_BUILTIN_VEC_SR, - ALTIVEC_BUILTIN_VEC_SRA, - ALTIVEC_BUILTIN_VEC_SRL, - ALTIVEC_BUILTIN_VEC_SRO, - ALTIVEC_BUILTIN_VEC_ST, - ALTIVEC_BUILTIN_VEC_STE, - ALTIVEC_BUILTIN_VEC_STL, - ALTIVEC_BUILTIN_VEC_STVEBX, - ALTIVEC_BUILTIN_VEC_STVEHX, - ALTIVEC_BUILTIN_VEC_STVEWX, - ALTIVEC_BUILTIN_VEC_STVLX, - ALTIVEC_BUILTIN_VEC_STVLXL, - ALTIVEC_BUILTIN_VEC_STVRX, - ALTIVEC_BUILTIN_VEC_STVRXL, - ALTIVEC_BUILTIN_VEC_SUB, - ALTIVEC_BUILTIN_VEC_SUBC, - ALTIVEC_BUILTIN_VEC_SUBS, - ALTIVEC_BUILTIN_VEC_SUM2S, - ALTIVEC_BUILTIN_VEC_SUM4S, - ALTIVEC_BUILTIN_VEC_SUMS, - ALTIVEC_BUILTIN_VEC_TRUNC, - ALTIVEC_BUILTIN_VEC_UNPACKH, - ALTIVEC_BUILTIN_VEC_UNPACKL, - ALTIVEC_BUILTIN_VEC_VADDFP, - ALTIVEC_BUILTIN_VEC_VADDSBS, - ALTIVEC_BUILTIN_VEC_VADDSHS, - ALTIVEC_BUILTIN_VEC_VADDSWS, - ALTIVEC_BUILTIN_VEC_VADDUBM, - ALTIVEC_BUILTIN_VEC_VADDUBS, - ALTIVEC_BUILTIN_VEC_VADDUHM, - ALTIVEC_BUILTIN_VEC_VADDUHS, - ALTIVEC_BUILTIN_VEC_VADDUWM, - ALTIVEC_BUILTIN_VEC_VADDUWS, - ALTIVEC_BUILTIN_VEC_VAVGSB, - ALTIVEC_BUILTIN_VEC_VAVGSH, - ALTIVEC_BUILTIN_VEC_VAVGSW, - ALTIVEC_BUILTIN_VEC_VAVGUB, - ALTIVEC_BUILTIN_VEC_VAVGUH, - 
ALTIVEC_BUILTIN_VEC_VAVGUW, - ALTIVEC_BUILTIN_VEC_VCFSX, - ALTIVEC_BUILTIN_VEC_VCFUX, - ALTIVEC_BUILTIN_VEC_VCMPEQFP, - ALTIVEC_BUILTIN_VEC_VCMPEQUB, - ALTIVEC_BUILTIN_VEC_VCMPEQUH, - ALTIVEC_BUILTIN_VEC_VCMPEQUW, - ALTIVEC_BUILTIN_VEC_VCMPGTFP, - ALTIVEC_BUILTIN_VEC_VCMPGTSB, - ALTIVEC_BUILTIN_VEC_VCMPGTSH, - ALTIVEC_BUILTIN_VEC_VCMPGTSW, - ALTIVEC_BUILTIN_VEC_VCMPGTUB, - ALTIVEC_BUILTIN_VEC_VCMPGTUH, - ALTIVEC_BUILTIN_VEC_VCMPGTUW, - ALTIVEC_BUILTIN_VEC_VMAXFP, - ALTIVEC_BUILTIN_VEC_VMAXSB, - ALTIVEC_BUILTIN_VEC_VMAXSH, - ALTIVEC_BUILTIN_VEC_VMAXSW, - ALTIVEC_BUILTIN_VEC_VMAXUB, - ALTIVEC_BUILTIN_VEC_VMAXUH, - ALTIVEC_BUILTIN_VEC_VMAXUW, - ALTIVEC_BUILTIN_VEC_VMINFP, - ALTIVEC_BUILTIN_VEC_VMINSB, - ALTIVEC_BUILTIN_VEC_VMINSH, - ALTIVEC_BUILTIN_VEC_VMINSW, - ALTIVEC_BUILTIN_VEC_VMINUB, - ALTIVEC_BUILTIN_VEC_VMINUH, - ALTIVEC_BUILTIN_VEC_VMINUW, - ALTIVEC_BUILTIN_VEC_VMRGHB, - ALTIVEC_BUILTIN_VEC_VMRGHH, - ALTIVEC_BUILTIN_VEC_VMRGHW, - ALTIVEC_BUILTIN_VEC_VMRGLB, - ALTIVEC_BUILTIN_VEC_VMRGLH, - ALTIVEC_BUILTIN_VEC_VMRGLW, - ALTIVEC_BUILTIN_VEC_VMSUMMBM, - ALTIVEC_BUILTIN_VEC_VMSUMSHM, - ALTIVEC_BUILTIN_VEC_VMSUMSHS, - ALTIVEC_BUILTIN_VEC_VMSUMUBM, - ALTIVEC_BUILTIN_VEC_VMSUMUHM, - ALTIVEC_BUILTIN_VEC_VMSUMUHS, - ALTIVEC_BUILTIN_VEC_VMULESB, - ALTIVEC_BUILTIN_VEC_VMULESH, - ALTIVEC_BUILTIN_VEC_VMULEUB, - ALTIVEC_BUILTIN_VEC_VMULEUH, - ALTIVEC_BUILTIN_VEC_VMULOSB, - ALTIVEC_BUILTIN_VEC_VMULOSH, - ALTIVEC_BUILTIN_VEC_VMULOUB, - ALTIVEC_BUILTIN_VEC_VMULOUH, - ALTIVEC_BUILTIN_VEC_VPKSHSS, - ALTIVEC_BUILTIN_VEC_VPKSHUS, - ALTIVEC_BUILTIN_VEC_VPKSWSS, - ALTIVEC_BUILTIN_VEC_VPKSWUS, - ALTIVEC_BUILTIN_VEC_VPKUHUM, - ALTIVEC_BUILTIN_VEC_VPKUHUS, - ALTIVEC_BUILTIN_VEC_VPKUWUM, - ALTIVEC_BUILTIN_VEC_VPKUWUS, - ALTIVEC_BUILTIN_VEC_VRLB, - ALTIVEC_BUILTIN_VEC_VRLH, - ALTIVEC_BUILTIN_VEC_VRLW, - ALTIVEC_BUILTIN_VEC_VSLB, - ALTIVEC_BUILTIN_VEC_VSLH, - ALTIVEC_BUILTIN_VEC_VSLW, - ALTIVEC_BUILTIN_VEC_VSPLTB, - ALTIVEC_BUILTIN_VEC_VSPLTH, - ALTIVEC_BUILTIN_VEC_VSPLTW, - 
ALTIVEC_BUILTIN_VEC_VSRAB, - ALTIVEC_BUILTIN_VEC_VSRAH, - ALTIVEC_BUILTIN_VEC_VSRAW, - ALTIVEC_BUILTIN_VEC_VSRB, - ALTIVEC_BUILTIN_VEC_VSRH, - ALTIVEC_BUILTIN_VEC_VSRW, - ALTIVEC_BUILTIN_VEC_VSUBFP, - ALTIVEC_BUILTIN_VEC_VSUBSBS, - ALTIVEC_BUILTIN_VEC_VSUBSHS, - ALTIVEC_BUILTIN_VEC_VSUBSWS, - ALTIVEC_BUILTIN_VEC_VSUBUBM, - ALTIVEC_BUILTIN_VEC_VSUBUBS, - ALTIVEC_BUILTIN_VEC_VSUBUHM, - ALTIVEC_BUILTIN_VEC_VSUBUHS, - ALTIVEC_BUILTIN_VEC_VSUBUWM, - ALTIVEC_BUILTIN_VEC_VSUBUWS, - ALTIVEC_BUILTIN_VEC_VSUM4SBS, - ALTIVEC_BUILTIN_VEC_VSUM4SHS, - ALTIVEC_BUILTIN_VEC_VSUM4UBS, - ALTIVEC_BUILTIN_VEC_VUPKHPX, - ALTIVEC_BUILTIN_VEC_VUPKHSB, - ALTIVEC_BUILTIN_VEC_VUPKHSH, - ALTIVEC_BUILTIN_VEC_VUPKLPX, - ALTIVEC_BUILTIN_VEC_VUPKLSB, - ALTIVEC_BUILTIN_VEC_VUPKLSH, - ALTIVEC_BUILTIN_VEC_XOR, - ALTIVEC_BUILTIN_VEC_STEP, - ALTIVEC_BUILTIN_VEC_PROMOTE, - ALTIVEC_BUILTIN_VEC_INSERT, - ALTIVEC_BUILTIN_VEC_SPLATS, - ALTIVEC_BUILTIN_OVERLOADED_LAST = ALTIVEC_BUILTIN_VEC_SPLATS, - - /* SPE builtins. */ - SPE_BUILTIN_EVADDW, - SPE_BUILTIN_EVAND, - SPE_BUILTIN_EVANDC, - SPE_BUILTIN_EVDIVWS, - SPE_BUILTIN_EVDIVWU, - SPE_BUILTIN_EVEQV, - SPE_BUILTIN_EVFSADD, - SPE_BUILTIN_EVFSDIV, - SPE_BUILTIN_EVFSMUL, - SPE_BUILTIN_EVFSSUB, - SPE_BUILTIN_EVLDDX, - SPE_BUILTIN_EVLDHX, - SPE_BUILTIN_EVLDWX, - SPE_BUILTIN_EVLHHESPLATX, - SPE_BUILTIN_EVLHHOSSPLATX, - SPE_BUILTIN_EVLHHOUSPLATX, - SPE_BUILTIN_EVLWHEX, - SPE_BUILTIN_EVLWHOSX, - SPE_BUILTIN_EVLWHOUX, - SPE_BUILTIN_EVLWHSPLATX, - SPE_BUILTIN_EVLWWSPLATX, - SPE_BUILTIN_EVMERGEHI, - SPE_BUILTIN_EVMERGEHILO, - SPE_BUILTIN_EVMERGELO, - SPE_BUILTIN_EVMERGELOHI, - SPE_BUILTIN_EVMHEGSMFAA, - SPE_BUILTIN_EVMHEGSMFAN, - SPE_BUILTIN_EVMHEGSMIAA, - SPE_BUILTIN_EVMHEGSMIAN, - SPE_BUILTIN_EVMHEGUMIAA, - SPE_BUILTIN_EVMHEGUMIAN, - SPE_BUILTIN_EVMHESMF, - SPE_BUILTIN_EVMHESMFA, - SPE_BUILTIN_EVMHESMFAAW, - SPE_BUILTIN_EVMHESMFANW, - SPE_BUILTIN_EVMHESMI, - SPE_BUILTIN_EVMHESMIA, - SPE_BUILTIN_EVMHESMIAAW, - SPE_BUILTIN_EVMHESMIANW, - SPE_BUILTIN_EVMHESSF, - 
SPE_BUILTIN_EVMHESSFA, - SPE_BUILTIN_EVMHESSFAAW, - SPE_BUILTIN_EVMHESSFANW, - SPE_BUILTIN_EVMHESSIAAW, - SPE_BUILTIN_EVMHESSIANW, - SPE_BUILTIN_EVMHEUMI, - SPE_BUILTIN_EVMHEUMIA, - SPE_BUILTIN_EVMHEUMIAAW, - SPE_BUILTIN_EVMHEUMIANW, - SPE_BUILTIN_EVMHEUSIAAW, - SPE_BUILTIN_EVMHEUSIANW, - SPE_BUILTIN_EVMHOGSMFAA, - SPE_BUILTIN_EVMHOGSMFAN, - SPE_BUILTIN_EVMHOGSMIAA, - SPE_BUILTIN_EVMHOGSMIAN, - SPE_BUILTIN_EVMHOGUMIAA, - SPE_BUILTIN_EVMHOGUMIAN, - SPE_BUILTIN_EVMHOSMF, - SPE_BUILTIN_EVMHOSMFA, - SPE_BUILTIN_EVMHOSMFAAW, - SPE_BUILTIN_EVMHOSMFANW, - SPE_BUILTIN_EVMHOSMI, - SPE_BUILTIN_EVMHOSMIA, - SPE_BUILTIN_EVMHOSMIAAW, - SPE_BUILTIN_EVMHOSMIANW, - SPE_BUILTIN_EVMHOSSF, - SPE_BUILTIN_EVMHOSSFA, - SPE_BUILTIN_EVMHOSSFAAW, - SPE_BUILTIN_EVMHOSSFANW, - SPE_BUILTIN_EVMHOSSIAAW, - SPE_BUILTIN_EVMHOSSIANW, - SPE_BUILTIN_EVMHOUMI, - SPE_BUILTIN_EVMHOUMIA, - SPE_BUILTIN_EVMHOUMIAAW, - SPE_BUILTIN_EVMHOUMIANW, - SPE_BUILTIN_EVMHOUSIAAW, - SPE_BUILTIN_EVMHOUSIANW, - SPE_BUILTIN_EVMWHSMF, - SPE_BUILTIN_EVMWHSMFA, - SPE_BUILTIN_EVMWHSMI, - SPE_BUILTIN_EVMWHSMIA, - SPE_BUILTIN_EVMWHSSF, - SPE_BUILTIN_EVMWHSSFA, - SPE_BUILTIN_EVMWHUMI, - SPE_BUILTIN_EVMWHUMIA, - SPE_BUILTIN_EVMWLSMIAAW, - SPE_BUILTIN_EVMWLSMIANW, - SPE_BUILTIN_EVMWLSSIAAW, - SPE_BUILTIN_EVMWLSSIANW, - SPE_BUILTIN_EVMWLUMI, - SPE_BUILTIN_EVMWLUMIA, - SPE_BUILTIN_EVMWLUMIAAW, - SPE_BUILTIN_EVMWLUMIANW, - SPE_BUILTIN_EVMWLUSIAAW, - SPE_BUILTIN_EVMWLUSIANW, - SPE_BUILTIN_EVMWSMF, - SPE_BUILTIN_EVMWSMFA, - SPE_BUILTIN_EVMWSMFAA, - SPE_BUILTIN_EVMWSMFAN, - SPE_BUILTIN_EVMWSMI, - SPE_BUILTIN_EVMWSMIA, - SPE_BUILTIN_EVMWSMIAA, - SPE_BUILTIN_EVMWSMIAN, - SPE_BUILTIN_EVMWHSSFAA, - SPE_BUILTIN_EVMWSSF, - SPE_BUILTIN_EVMWSSFA, - SPE_BUILTIN_EVMWSSFAA, - SPE_BUILTIN_EVMWSSFAN, - SPE_BUILTIN_EVMWUMI, - SPE_BUILTIN_EVMWUMIA, - SPE_BUILTIN_EVMWUMIAA, - SPE_BUILTIN_EVMWUMIAN, - SPE_BUILTIN_EVNAND, - SPE_BUILTIN_EVNOR, - SPE_BUILTIN_EVOR, - SPE_BUILTIN_EVORC, - SPE_BUILTIN_EVRLW, - SPE_BUILTIN_EVSLW, - SPE_BUILTIN_EVSRWS, - 
SPE_BUILTIN_EVSRWU, - SPE_BUILTIN_EVSTDDX, - SPE_BUILTIN_EVSTDHX, - SPE_BUILTIN_EVSTDWX, - SPE_BUILTIN_EVSTWHEX, - SPE_BUILTIN_EVSTWHOX, - SPE_BUILTIN_EVSTWWEX, - SPE_BUILTIN_EVSTWWOX, - SPE_BUILTIN_EVSUBFW, - SPE_BUILTIN_EVXOR, - SPE_BUILTIN_EVABS, - SPE_BUILTIN_EVADDSMIAAW, - SPE_BUILTIN_EVADDSSIAAW, - SPE_BUILTIN_EVADDUMIAAW, - SPE_BUILTIN_EVADDUSIAAW, - SPE_BUILTIN_EVCNTLSW, - SPE_BUILTIN_EVCNTLZW, - SPE_BUILTIN_EVEXTSB, - SPE_BUILTIN_EVEXTSH, - SPE_BUILTIN_EVFSABS, - SPE_BUILTIN_EVFSCFSF, - SPE_BUILTIN_EVFSCFSI, - SPE_BUILTIN_EVFSCFUF, - SPE_BUILTIN_EVFSCFUI, - SPE_BUILTIN_EVFSCTSF, - SPE_BUILTIN_EVFSCTSI, - SPE_BUILTIN_EVFSCTSIZ, - SPE_BUILTIN_EVFSCTUF, - SPE_BUILTIN_EVFSCTUI, - SPE_BUILTIN_EVFSCTUIZ, - SPE_BUILTIN_EVFSNABS, - SPE_BUILTIN_EVFSNEG, - SPE_BUILTIN_EVMRA, - SPE_BUILTIN_EVNEG, - SPE_BUILTIN_EVRNDW, - SPE_BUILTIN_EVSUBFSMIAAW, - SPE_BUILTIN_EVSUBFSSIAAW, - SPE_BUILTIN_EVSUBFUMIAAW, - SPE_BUILTIN_EVSUBFUSIAAW, - SPE_BUILTIN_EVADDIW, - SPE_BUILTIN_EVLDD, - SPE_BUILTIN_EVLDH, - SPE_BUILTIN_EVLDW, - SPE_BUILTIN_EVLHHESPLAT, - SPE_BUILTIN_EVLHHOSSPLAT, - SPE_BUILTIN_EVLHHOUSPLAT, - SPE_BUILTIN_EVLWHE, - SPE_BUILTIN_EVLWHOS, - SPE_BUILTIN_EVLWHOU, - SPE_BUILTIN_EVLWHSPLAT, - SPE_BUILTIN_EVLWWSPLAT, - SPE_BUILTIN_EVRLWI, - SPE_BUILTIN_EVSLWI, - SPE_BUILTIN_EVSRWIS, - SPE_BUILTIN_EVSRWIU, - SPE_BUILTIN_EVSTDD, - SPE_BUILTIN_EVSTDH, - SPE_BUILTIN_EVSTDW, - SPE_BUILTIN_EVSTWHE, - SPE_BUILTIN_EVSTWHO, - SPE_BUILTIN_EVSTWWE, - SPE_BUILTIN_EVSTWWO, - SPE_BUILTIN_EVSUBIFW, - - /* Compares. */ - SPE_BUILTIN_EVCMPEQ, - SPE_BUILTIN_EVCMPGTS, - SPE_BUILTIN_EVCMPGTU, - SPE_BUILTIN_EVCMPLTS, - SPE_BUILTIN_EVCMPLTU, - SPE_BUILTIN_EVFSCMPEQ, - SPE_BUILTIN_EVFSCMPGT, - SPE_BUILTIN_EVFSCMPLT, - SPE_BUILTIN_EVFSTSTEQ, - SPE_BUILTIN_EVFSTSTGT, - SPE_BUILTIN_EVFSTSTLT, - - /* EVSEL compares. 
*/ - SPE_BUILTIN_EVSEL_CMPEQ, - SPE_BUILTIN_EVSEL_CMPGTS, - SPE_BUILTIN_EVSEL_CMPGTU, - SPE_BUILTIN_EVSEL_CMPLTS, - SPE_BUILTIN_EVSEL_CMPLTU, - SPE_BUILTIN_EVSEL_FSCMPEQ, - SPE_BUILTIN_EVSEL_FSCMPGT, - SPE_BUILTIN_EVSEL_FSCMPLT, - SPE_BUILTIN_EVSEL_FSTSTEQ, - SPE_BUILTIN_EVSEL_FSTSTGT, - SPE_BUILTIN_EVSEL_FSTSTLT, - - SPE_BUILTIN_EVSPLATFI, - SPE_BUILTIN_EVSPLATI, - SPE_BUILTIN_EVMWHSSMAA, - SPE_BUILTIN_EVMWHSMFAA, - SPE_BUILTIN_EVMWHSMIAA, - SPE_BUILTIN_EVMWHUSIAA, - SPE_BUILTIN_EVMWHUMIAA, - SPE_BUILTIN_EVMWHSSFAN, - SPE_BUILTIN_EVMWHSSIAN, - SPE_BUILTIN_EVMWHSMFAN, - SPE_BUILTIN_EVMWHSMIAN, - SPE_BUILTIN_EVMWHUSIAN, - SPE_BUILTIN_EVMWHUMIAN, - SPE_BUILTIN_EVMWHGSSFAA, - SPE_BUILTIN_EVMWHGSMFAA, - SPE_BUILTIN_EVMWHGSMIAA, - SPE_BUILTIN_EVMWHGUMIAA, - SPE_BUILTIN_EVMWHGSSFAN, - SPE_BUILTIN_EVMWHGSMFAN, - SPE_BUILTIN_EVMWHGSMIAN, - SPE_BUILTIN_EVMWHGUMIAN, - SPE_BUILTIN_MTSPEFSCR, - SPE_BUILTIN_MFSPEFSCR, - SPE_BUILTIN_BRINC, - - /* PAIRED builtins. */ - PAIRED_BUILTIN_DIVV2SF3, - PAIRED_BUILTIN_ABSV2SF2, - PAIRED_BUILTIN_NEGV2SF2, - PAIRED_BUILTIN_SQRTV2SF2, - PAIRED_BUILTIN_ADDV2SF3, - PAIRED_BUILTIN_SUBV2SF3, - PAIRED_BUILTIN_RESV2SF2, - PAIRED_BUILTIN_MULV2SF3, - PAIRED_BUILTIN_MSUB, - PAIRED_BUILTIN_MADD, - PAIRED_BUILTIN_NMSUB, - PAIRED_BUILTIN_NMADD, - PAIRED_BUILTIN_NABSV2SF2, - PAIRED_BUILTIN_SUM0, - PAIRED_BUILTIN_SUM1, - PAIRED_BUILTIN_MULS0, - PAIRED_BUILTIN_MULS1, - PAIRED_BUILTIN_MERGE00, - PAIRED_BUILTIN_MERGE01, - PAIRED_BUILTIN_MERGE10, - PAIRED_BUILTIN_MERGE11, - PAIRED_BUILTIN_MADDS0, - PAIRED_BUILTIN_MADDS1, - PAIRED_BUILTIN_STX, - PAIRED_BUILTIN_LX, - PAIRED_BUILTIN_SELV2SF4, - PAIRED_BUILTIN_CMPU0, - PAIRED_BUILTIN_CMPU1, - - RS6000_BUILTIN_RECIP, - RS6000_BUILTIN_RECIPF, - RS6000_BUILTIN_RSQRTF, +#include "rs6000-builtin.def" - RS6000_BUILTIN_COUNT + MAX_RS6000_BUILTINS }; +#undef RS6000_BUILTIN +#undef RS6000_BUILTIN_EQUATE + enum rs6000_builtin_type_index { RS6000_BTI_NOT_OPAQUE, @@ -3067,6 +2544,8 @@ enum rs6000_builtin_type_index 
   RS6000_BTI_V16QI,
   RS6000_BTI_V2SI,
   RS6000_BTI_V2SF,
+  RS6000_BTI_V2DI,
+  RS6000_BTI_V2DF,
   RS6000_BTI_V4HI,
   RS6000_BTI_V4SI,
   RS6000_BTI_V4SF,
@@ -3074,13 +2553,16 @@ enum rs6000_builtin_type_index
   RS6000_BTI_unsigned_V16QI,
   RS6000_BTI_unsigned_V8HI,
   RS6000_BTI_unsigned_V4SI,
+  RS6000_BTI_unsigned_V2DI,
   RS6000_BTI_bool_char, /* __bool char */
   RS6000_BTI_bool_short, /* __bool short */
   RS6000_BTI_bool_int, /* __bool int */
+  RS6000_BTI_bool_long, /* __bool long */
   RS6000_BTI_pixel, /* __pixel */
   RS6000_BTI_bool_V16QI, /* __vector __bool char */
   RS6000_BTI_bool_V8HI, /* __vector __bool short */
   RS6000_BTI_bool_V4SI, /* __vector __bool int */
+  RS6000_BTI_bool_V2DI, /* __vector __bool long */
   RS6000_BTI_pixel_V8HI, /* __vector __pixel */
   RS6000_BTI_long, /* long_integer_type_node */
   RS6000_BTI_unsigned_long, /* long_unsigned_type_node */
@@ -3090,7 +2572,10 @@ enum rs6000_builtin_type_index
   RS6000_BTI_UINTHI, /* unsigned_intHI_type_node */
   RS6000_BTI_INTSI, /* intSI_type_node */
   RS6000_BTI_UINTSI, /* unsigned_intSI_type_node */
+  RS6000_BTI_INTDI, /* intDI_type_node */
+  RS6000_BTI_UINTDI, /* unsigned_intDI_type_node */
   RS6000_BTI_float, /* float_type_node */
+  RS6000_BTI_double, /* double_type_node */
   RS6000_BTI_void, /* void_type_node */
   RS6000_BTI_MAX
 };
@@ -3101,6 +2586,8 @@ enum rs6000_builtin_type_index
 #define opaque_p_V2SI_type_node (rs6000_builtin_types[RS6000_BTI_opaque_p_V2SI])
 #define opaque_V4SI_type_node (rs6000_builtin_types[RS6000_BTI_opaque_V4SI])
 #define V16QI_type_node (rs6000_builtin_types[RS6000_BTI_V16QI])
+#define V2DI_type_node (rs6000_builtin_types[RS6000_BTI_V2DI])
+#define V2DF_type_node (rs6000_builtin_types[RS6000_BTI_V2DF])
 #define V2SI_type_node (rs6000_builtin_types[RS6000_BTI_V2SI])
 #define V2SF_type_node (rs6000_builtin_types[RS6000_BTI_V2SF])
 #define V4HI_type_node (rs6000_builtin_types[RS6000_BTI_V4HI])
@@ -3110,13 +2597,16 @@ enum rs6000_builtin_type_index
 #define unsigned_V16QI_type_node (rs6000_builtin_types[RS6000_BTI_unsigned_V16QI])
 #define unsigned_V8HI_type_node (rs6000_builtin_types[RS6000_BTI_unsigned_V8HI])
 #define unsigned_V4SI_type_node (rs6000_builtin_types[RS6000_BTI_unsigned_V4SI])
+#define unsigned_V2DI_type_node (rs6000_builtin_types[RS6000_BTI_unsigned_V2DI])
 #define bool_char_type_node (rs6000_builtin_types[RS6000_BTI_bool_char])
 #define bool_short_type_node (rs6000_builtin_types[RS6000_BTI_bool_short])
 #define bool_int_type_node (rs6000_builtin_types[RS6000_BTI_bool_int])
+#define bool_long_type_node (rs6000_builtin_types[RS6000_BTI_bool_long])
 #define pixel_type_node (rs6000_builtin_types[RS6000_BTI_pixel])
 #define bool_V16QI_type_node (rs6000_builtin_types[RS6000_BTI_bool_V16QI])
 #define bool_V8HI_type_node (rs6000_builtin_types[RS6000_BTI_bool_V8HI])
 #define bool_V4SI_type_node (rs6000_builtin_types[RS6000_BTI_bool_V4SI])
+#define bool_V2DI_type_node (rs6000_builtin_types[RS6000_BTI_bool_V2DI])
 #define pixel_V8HI_type_node (rs6000_builtin_types[RS6000_BTI_pixel_V8HI])

 #define long_integer_type_internal_node (rs6000_builtin_types[RS6000_BTI_long])
@@ -3127,9 +2617,12 @@ enum rs6000_builtin_type_index
 #define uintHI_type_internal_node (rs6000_builtin_types[RS6000_BTI_UINTHI])
 #define intSI_type_internal_node (rs6000_builtin_types[RS6000_BTI_INTSI])
 #define uintSI_type_internal_node (rs6000_builtin_types[RS6000_BTI_UINTSI])
+#define intDI_type_internal_node (rs6000_builtin_types[RS6000_BTI_INTDI])
+#define uintDI_type_internal_node (rs6000_builtin_types[RS6000_BTI_UINTDI])
 #define float_type_internal_node (rs6000_builtin_types[RS6000_BTI_float])
+#define double_type_internal_node (rs6000_builtin_types[RS6000_BTI_double])
 #define void_type_internal_node (rs6000_builtin_types[RS6000_BTI_void])

 extern GTY(()) tree rs6000_builtin_types[RS6000_BTI_MAX];
-extern GTY(()) tree rs6000_builtin_decls[RS6000_BUILTIN_COUNT];
+extern GTY(()) tree rs6000_builtin_decls[MAX_RS6000_BUILTINS];

Index: gcc-4.3.4-20091019/gcc/config/rs6000/rs6000.c
===================================================================
--- gcc-4.3.4-20091019.orig/gcc/config/rs6000/rs6000.c 2009-10-19 13:39:52.000000000 +0200
+++ gcc-4.3.4-20091019/gcc/config/rs6000/rs6000.c 2009-10-19 13:40:37.000000000 +0200
@@ -1,6 +1,6 @@
 /* Subroutines used for code generation on IBM RS/6000.
    Copyright (C) 1991, 1993, 1994, 1995, 1996, 1997, 1998, 1999,
-   2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008
+   2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009
    Free Software Foundation, Inc.
    Contributed by Richard Kenner (kenner@vlsi1.ultra.nyu.edu)

@@ -172,6 +172,9 @@ int rs6000_ieeequad;
 /* Nonzero to use AltiVec ABI. */
 int rs6000_altivec_abi;

+/* Nonzero if we want SPE SIMD instructions. */
+int rs6000_spe;
+
 /* Nonzero if we want SPE ABI extensions. */
 int rs6000_spe_abi;

@@ -221,14 +224,38 @@ int dot_symbols;
 const char *rs6000_debug_name;
 int rs6000_debug_stack; /* debug stack applications */
 int rs6000_debug_arg; /* debug argument handling */
+int rs6000_debug_reg; /* debug register classes */
+int rs6000_debug_addr; /* debug memory addressing */
+int rs6000_debug_cost; /* debug rtx_costs */
+
+/* Specify the machine mode that pointers have. After generation of rtl, the
+   compiler makes no further distinction between pointers and any other objects
+   of this machine mode. The type is unsigned since not all things that
+   include rs6000.h also include machmode.h. */
+unsigned rs6000_pmode;
+
+/* Width in bits of a pointer. */
+unsigned rs6000_pointer_size;
+

 /* Value is TRUE if register/mode pair is acceptable. */
 bool rs6000_hard_regno_mode_ok_p[NUM_MACHINE_MODES][FIRST_PSEUDO_REGISTER];

-/* Built in types. */
+/* Maximum number of registers needed for a given register class and mode. */
+unsigned char rs6000_class_max_nregs[NUM_MACHINE_MODES][LIM_REG_CLASSES];

+/* How many registers are needed for a given register and mode. */
+unsigned char rs6000_hard_regno_nregs[NUM_MACHINE_MODES][FIRST_PSEUDO_REGISTER];
+
+/* Map register number to register class. */
+enum reg_class rs6000_regno_regclass[FIRST_PSEUDO_REGISTER];
+
+/* Reload functions based on the type and the vector unit. */
+static enum insn_code rs6000_vector_reload[NUM_MACHINE_MODES][2];
+
+/* Built in types. */
 tree rs6000_builtin_types[RS6000_BTI_MAX];
-tree rs6000_builtin_decls[RS6000_BUILTIN_COUNT];
+tree rs6000_builtin_decls[MAX_RS6000_BUILTINS];

 const char *rs6000_traceback_name;
 static enum {
@@ -257,14 +284,13 @@ static GTY(()) section *toc_section;
 int rs6000_alignment_flags;

 /* True for any options that were explicitly set. */
-struct {
+static struct {
   bool aix_struct_ret; /* True if -maix-struct-ret was used. */
   bool alignment; /* True if -malign- was used. */
   bool spe_abi; /* True if -mabi=spe/no-spe was used. */
   bool altivec_abi; /* True if -mabi=altivec/no-altivec used. */
   bool spe; /* True if -mspe= was used. */
   bool float_gprs; /* True if -mfloat-gprs= was used. */
-  bool isel; /* True if -misel was used. */
   bool long_double; /* True if -mlong-double- was used. */
   bool ieee; /* True if -mabi=ieee/ibmlongdouble used. */
   bool vrsave; /* True if -mvrsave was used. */
@@ -280,6 +306,20 @@ struct builtin_description
   const char *const name;
   const enum rs6000_builtins code;
 };
+
+/* Describe the vector unit used for modes. */
+enum rs6000_vector rs6000_vector_unit[NUM_MACHINE_MODES];
+enum rs6000_vector rs6000_vector_mem[NUM_MACHINE_MODES];
+
+/* Register classes for various constraints that are based on the target
+   switches. */
+enum reg_class rs6000_constraints[RS6000_CONSTRAINT_MAX];
+
+/* Describe the alignment of a vector. */
+int rs6000_vector_align[NUM_MACHINE_MODES];
+
+/* Map selected modes to types for builtins. */
+static GTY(()) tree builtin_mode_to_type[MAX_MACHINE_MODE][2];

 /* Target cpu costs. */

@@ -356,7 +396,7 @@ struct processor_costs rios1_cost = {
   COSTS_N_INSNS (2), /* dmul */
   COSTS_N_INSNS (19), /* sdiv */
   COSTS_N_INSNS (19), /* ddiv */
-  128,
+  128, /* cache line size */
   64, /* l1 cache */
   512, /* l2 cache */
   0, /* streams */
@@ -375,7 +415,7 @@ struct processor_costs rios2_cost = {
   COSTS_N_INSNS (2), /* dmul */
   COSTS_N_INSNS (17), /* sdiv */
   COSTS_N_INSNS (17), /* ddiv */
-  256,
+  256, /* cache line size */
   256, /* l1 cache */
   1024, /* l2 cache */
   0, /* streams */
@@ -394,7 +434,7 @@ struct processor_costs rs64a_cost = {
   COSTS_N_INSNS (4), /* dmul */
   COSTS_N_INSNS (31), /* sdiv */
   COSTS_N_INSNS (31), /* ddiv */
-  128,
+  128, /* cache line size */
   128, /* l1 cache */
   2048, /* l2 cache */
   1, /* streams */
@@ -413,7 +453,7 @@ struct processor_costs mpccore_cost = {
   COSTS_N_INSNS (5), /* dmul */
   COSTS_N_INSNS (10), /* sdiv */
   COSTS_N_INSNS (17), /* ddiv */
-  32,
+  32, /* cache line size */
   4, /* l1 cache */
   16, /* l2 cache */
   1, /* streams */
@@ -432,7 +472,7 @@ struct processor_costs ppc403_cost = {
   COSTS_N_INSNS (11), /* dmul */
   COSTS_N_INSNS (11), /* sdiv */
   COSTS_N_INSNS (11), /* ddiv */
-  32,
+  32, /* cache line size */
   4, /* l1 cache */
   16, /* l2 cache */
   1, /* streams */
@@ -451,7 +491,7 @@ struct processor_costs ppc405_cost = {
   COSTS_N_INSNS (11), /* dmul */
   COSTS_N_INSNS (11), /* sdiv */
   COSTS_N_INSNS (11), /* ddiv */
-  32,
+  32, /* cache line size */
   16, /* l1 cache */
   128, /* l2 cache */
   1, /* streams */
@@ -470,7 +510,7 @@ struct processor_costs ppc440_cost = {
   COSTS_N_INSNS (5), /* dmul */
   COSTS_N_INSNS (19), /* sdiv */
   COSTS_N_INSNS (33), /* ddiv */
-  32,
+  32, /* cache line size */
   32, /* l1 cache */
   256, /* l2 cache */
   1, /* streams */
@@ -489,7 +529,7 @@ struct processor_costs ppc601_cost = {
   COSTS_N_INSNS (5), /* dmul */
   COSTS_N_INSNS (17), /* sdiv */
   COSTS_N_INSNS (31), /* ddiv */
-  32,
+  32, /* cache line size */
   32, /* l1 cache */
   256, /* l2 cache */
   1, /* streams */
@@ -508,7 +548,7 @@ struct processor_costs ppc603_cost = {
   COSTS_N_INSNS (4), /* dmul */
   COSTS_N_INSNS (18), /* sdiv */
   COSTS_N_INSNS (33), /* ddiv */
-  32,
+  32, /* cache line size */
   8, /* l1 cache */
   64, /* l2 cache */
   1, /* streams */
@@ -527,7 +567,7 @@ struct processor_costs ppc604_cost = {
   COSTS_N_INSNS (3), /* dmul */
   COSTS_N_INSNS (18), /* sdiv */
   COSTS_N_INSNS (32), /* ddiv */
-  32,
+  32, /* cache line size */
   16, /* l1 cache */
   512, /* l2 cache */
   1, /* streams */
@@ -546,7 +586,7 @@ struct processor_costs ppc604e_cost = {
   COSTS_N_INSNS (3), /* dmul */
   COSTS_N_INSNS (18), /* sdiv */
   COSTS_N_INSNS (32), /* ddiv */
-  32,
+  32, /* cache line size */
   32, /* l1 cache */
   1024, /* l2 cache */
   1, /* streams */
@@ -565,7 +605,7 @@ struct processor_costs ppc620_cost = {
   COSTS_N_INSNS (3), /* dmul */
   COSTS_N_INSNS (18), /* sdiv */
   COSTS_N_INSNS (32), /* ddiv */
-  128,
+  128, /* cache line size */
   32, /* l1 cache */
   1024, /* l2 cache */
   1, /* streams */
@@ -584,7 +624,7 @@ struct processor_costs ppc630_cost = {
   COSTS_N_INSNS (3), /* dmul */
   COSTS_N_INSNS (17), /* sdiv */
   COSTS_N_INSNS (21), /* ddiv */
-  128,
+  128, /* cache line size */
   64, /* l1 cache */
   1024, /* l2 cache */
   1, /* streams */
@@ -604,7 +644,7 @@ struct processor_costs ppccell_cost = {
   COSTS_N_INSNS (10/2), /* dmul */
   COSTS_N_INSNS (74/2), /* sdiv */
   COSTS_N_INSNS (74/2), /* ddiv */
-  128,
+  128, /* cache line size */
   32, /* l1 cache */
   512, /* l2 cache */
   6, /* streams */
@@ -623,7 +663,7 @@ struct processor_costs ppc750_cost = {
   COSTS_N_INSNS (3), /* dmul */
   COSTS_N_INSNS (17), /* sdiv */
   COSTS_N_INSNS (31), /* ddiv */
-  32,
+  32, /* cache line size */
   32, /* l1 cache */
   512, /* l2 cache */
   1, /* streams */
@@ -642,7 +682,7 @@ struct processor_costs ppc7450_cost = {
   COSTS_N_INSNS (5), /* dmul */
   COSTS_N_INSNS (21), /* sdiv */
   COSTS_N_INSNS (35), /* ddiv */
-  32,
+  32, /* cache line size */
   32, /* l1 cache */
   1024, /* l2 cache */
   1, /* streams */
@@ -661,12 +701,50 @@ struct processor_costs ppc8540_cost = {
   COSTS_N_INSNS (4), /* dmul */
   COSTS_N_INSNS (29),
/* sdiv */ COSTS_N_INSNS (29), /* ddiv */ - 32, + 32, /* cache line size */ 32, /* l1 cache */ 256, /* l2 cache */ 1, /* prefetch streams /*/ }; +/* Instruction costs on E300C2 and E300C3 cores. */ +static const +struct processor_costs ppce300c2c3_cost = { + COSTS_N_INSNS (4), /* mulsi */ + COSTS_N_INSNS (4), /* mulsi_const */ + COSTS_N_INSNS (4), /* mulsi_const9 */ + COSTS_N_INSNS (4), /* muldi */ + COSTS_N_INSNS (19), /* divsi */ + COSTS_N_INSNS (19), /* divdi */ + COSTS_N_INSNS (3), /* fp */ + COSTS_N_INSNS (4), /* dmul */ + COSTS_N_INSNS (18), /* sdiv */ + COSTS_N_INSNS (33), /* ddiv */ + 32, + 16, /* l1 cache */ + 16, /* l2 cache */ + 1, /* prefetch streams /*/ +}; + +/* Instruction costs on PPCE500MC processors. */ +static const +struct processor_costs ppce500mc_cost = { + COSTS_N_INSNS (4), /* mulsi */ + COSTS_N_INSNS (4), /* mulsi_const */ + COSTS_N_INSNS (4), /* mulsi_const9 */ + COSTS_N_INSNS (4), /* muldi */ + COSTS_N_INSNS (14), /* divsi */ + COSTS_N_INSNS (14), /* divdi */ + COSTS_N_INSNS (8), /* fp */ + COSTS_N_INSNS (10), /* dmul */ + COSTS_N_INSNS (36), /* sdiv */ + COSTS_N_INSNS (66), /* ddiv */ + 64, /* cache line size */ + 32, /* l1 cache */ + 128, /* l2 cache */ + 1, /* prefetch streams /*/ +}; + /* Instruction costs on POWER4 and POWER5 processors. */ static const struct processor_costs power4_cost = { @@ -680,7 +758,7 @@ struct processor_costs power4_cost = { COSTS_N_INSNS (3), /* dmul */ COSTS_N_INSNS (17), /* sdiv */ COSTS_N_INSNS (17), /* ddiv */ - 128, + 128, /* cache line size */ 32, /* l1 cache */ 1024, /* l2 cache */ 8, /* prefetch streams /*/ @@ -699,34 +777,72 @@ struct processor_costs power6_cost = { COSTS_N_INSNS (3), /* dmul */ COSTS_N_INSNS (13), /* sdiv */ COSTS_N_INSNS (16), /* ddiv */ - 128, + 128, /* cache line size */ 64, /* l1 cache */ 2048, /* l2 cache */ 16, /* prefetch streams */ }; +/* Instruction costs on POWER7 processors. 
*/ +static const +struct processor_costs power7_cost = { + COSTS_N_INSNS (2), /* mulsi */ + COSTS_N_INSNS (2), /* mulsi_const */ + COSTS_N_INSNS (2), /* mulsi_const9 */ + COSTS_N_INSNS (2), /* muldi */ + COSTS_N_INSNS (18), /* divsi */ + COSTS_N_INSNS (34), /* divdi */ + COSTS_N_INSNS (3), /* fp */ + COSTS_N_INSNS (3), /* dmul */ + COSTS_N_INSNS (13), /* sdiv */ + COSTS_N_INSNS (16), /* ddiv */ + 128, /* cache line size */ + 32, /* l1 cache */ + 256, /* l2 cache */ + 12, /* prefetch streams */ +}; + + +/* Table that classifies rs6000 builtin functions (pure, const, etc.). */ +#undef RS6000_BUILTIN +#undef RS6000_BUILTIN_EQUATE +#define RS6000_BUILTIN(NAME, TYPE) TYPE, +#define RS6000_BUILTIN_EQUATE(NAME, VALUE) + +static const enum rs6000_btc builtin_classify[(int)MAX_RS6000_BUILTINS] = +{ +#include "rs6000-builtin.def" +}; + +#undef RS6000_BUILTIN +#undef RS6000_BUILTIN_EQUATE + static bool rs6000_function_ok_for_sibcall (tree, tree); static const char *rs6000_invalid_within_doloop (const_rtx); +static bool rs6000_legitimate_address_p (enum machine_mode, rtx, bool); +static bool rs6000_debug_legitimate_address_p (enum machine_mode, rtx, bool); +bool (*rs6000_legitimate_address_ptr) (enum machine_mode, rtx, bool) + = rs6000_legitimate_address_p; static rtx rs6000_generate_compare (enum rtx_code); static void rs6000_emit_stack_tie (void); static void rs6000_frame_related (rtx, rtx, HOST_WIDE_INT, rtx, rtx); -static rtx spe_synthesize_frame_save (rtx); static bool spe_func_has_64bit_regs_p (void); static void emit_frame_save (rtx, rtx, enum machine_mode, unsigned int, int, HOST_WIDE_INT); static rtx gen_frame_mem_offset (enum machine_mode, rtx, int); -static void rs6000_emit_allocate_stack (HOST_WIDE_INT, int); +static void rs6000_emit_allocate_stack (HOST_WIDE_INT, int, int); static unsigned rs6000_hash_constant (rtx); static unsigned toc_hash_function (const void *); static int toc_hash_eq (const void *, const void *); -static int constant_pool_expr_1 (rtx, int *, 
int *); +static bool reg_offset_addressing_ok_p (enum machine_mode); +static bool virtual_stack_registers_memory_p (rtx); static bool constant_pool_expr_p (rtx); static bool legitimate_small_data_p (enum machine_mode, rtx); static bool legitimate_lo_sum_address_p (enum machine_mode, rtx, int); static struct machine_function * rs6000_init_machine_status (void); static bool rs6000_assemble_integer (rtx, unsigned int, int); -static bool no_global_regs_above (int); +static bool no_global_regs_above (int, bool); #ifdef HAVE_GAS_HIDDEN static void rs6000_assemble_visibility (tree, int); #endif @@ -739,7 +855,14 @@ static void rs6000_eliminate_indexed_mem static const char *rs6000_mangle_type (const_tree); extern const struct attribute_spec rs6000_attribute_table[]; static void rs6000_set_default_type_attributes (tree); +static rtx rs6000_savres_routine_sym (rs6000_stack_t *, bool, bool, bool); +static void rs6000_emit_stack_reset (rs6000_stack_t *, rtx, rtx, int, bool); +static rtx rs6000_make_savres_rtx (rs6000_stack_t *, rtx, int, + enum machine_mode, bool, bool, bool); static bool rs6000_reg_live_or_pic_offset_p (int); +static tree rs6000_builtin_vectorized_function (unsigned int, tree, tree); +static int rs6000_savres_strategy (rs6000_stack_t *, bool, int, int); +static void rs6000_restore_saved_cr (rtx, int); static void rs6000_output_function_prologue (FILE *, HOST_WIDE_INT); static void rs6000_output_function_epilogue (FILE *, HOST_WIDE_INT); static void rs6000_output_mi_thunk (FILE *, tree, HOST_WIDE_INT, HOST_WIDE_INT, @@ -779,7 +902,10 @@ static void rs6000_xcoff_file_end (void) #endif static int rs6000_variable_issue (FILE *, int, rtx, int); static bool rs6000_rtx_costs (rtx, int, int, int *); +static bool rs6000_debug_rtx_costs (rtx, int, int, int *); +static int rs6000_debug_address_cost (rtx); static int rs6000_adjust_cost (rtx, rtx, rtx, int); +static int rs6000_debug_adjust_cost (rtx, rtx, rtx, int); static void rs6000_sched_init (FILE *, int, int); 
static bool is_microcoded_insn (rtx); static bool is_nonpipeline_insn (rtx); @@ -811,6 +937,10 @@ static tree rs6000_builtin_mask_for_load static tree rs6000_builtin_mul_widen_even (tree); static tree rs6000_builtin_mul_widen_odd (tree); static tree rs6000_builtin_conversion (enum tree_code, tree); +static bool rs6000_builtin_support_vector_misalignment (enum + machine_mode, + const_tree, + int, bool); static void def_builtin (int, const char *, tree, int); static bool rs6000_vector_alignment_reachable (const_tree, bool); @@ -820,6 +950,11 @@ static rtx rs6000_expand_binop_builtin ( static rtx rs6000_expand_ternop_builtin (enum insn_code, tree, rtx); static rtx rs6000_expand_builtin (tree, rtx, rtx, enum machine_mode, int); static void altivec_init_builtins (void); +static unsigned builtin_hash_function (const void *); +static int builtin_hash_eq (const void *, const void *); +static tree builtin_function_type (enum machine_mode, enum machine_mode, + enum machine_mode, enum machine_mode, + enum rs6000_builtins, const char *name); static void rs6000_common_init_builtins (void); static void rs6000_init_libfuncs (void); @@ -847,8 +982,7 @@ static rtx altivec_expand_ld_builtin (tr static rtx altivec_expand_st_builtin (tree, rtx, bool *); static rtx altivec_expand_dst_builtin (tree, rtx, bool *); static rtx altivec_expand_abs_builtin (enum insn_code, tree, rtx); -static rtx altivec_expand_predicate_builtin (enum insn_code, - const char *, tree, rtx); +static rtx altivec_expand_predicate_builtin (enum insn_code, tree, rtx); static rtx altivec_expand_stv_builtin (enum insn_code, tree); static rtx altivec_expand_vec_init_builtin (tree, tree, rtx); static rtx altivec_expand_vec_set_builtin (tree); @@ -866,6 +1000,10 @@ int easy_vector_constant (rtx, enum mach static bool rs6000_is_opaque_type (const_tree); static rtx rs6000_dwarf_register_span (rtx); static void rs6000_init_dwarf_reg_sizes_extra (tree); +static rtx rs6000_legitimize_address (rtx, rtx, enum machine_mode); 
+static rtx rs6000_debug_legitimize_address (rtx, rtx, enum machine_mode); +rtx (*rs6000_legitimize_address_ptr) (rtx, rtx, enum machine_mode) + = rs6000_legitimize_address; static rtx rs6000_legitimize_tls_address (rtx, enum tls_model); static void rs6000_output_dwarf_dtprel (FILE *, int, rtx) ATTRIBUTE_UNUSED; static rtx rs6000_tls_get_addr (void); @@ -910,14 +1048,68 @@ static tree rs6000_gimplify_va_arg (tree static bool rs6000_must_pass_in_stack (enum machine_mode, const_tree); static bool rs6000_scalar_mode_supported_p (enum machine_mode); static bool rs6000_vector_mode_supported_p (enum machine_mode); -static int get_vec_cmp_insn (enum rtx_code, enum machine_mode, - enum machine_mode); +static rtx rs6000_emit_vector_compare_inner (enum rtx_code, rtx, rtx); static rtx rs6000_emit_vector_compare (enum rtx_code, rtx, rtx, enum machine_mode); -static int get_vsel_insn (enum machine_mode); -static void rs6000_emit_vector_select (rtx, rtx, rtx, rtx); static tree rs6000_stack_protect_fail (void); +static rtx rs6000_legitimize_reload_address (rtx, enum machine_mode, int, int, + int, int *); + +static rtx rs6000_debug_legitimize_reload_address (rtx, enum machine_mode, int, + int, int, int *); + +rtx (*rs6000_legitimize_reload_address_ptr) (rtx, enum machine_mode, int, int, + int, int *) + = rs6000_legitimize_reload_address; + +static bool rs6000_mode_dependent_address (rtx); +static bool rs6000_debug_mode_dependent_address (rtx); +bool (*rs6000_mode_dependent_address_ptr) (rtx) + = rs6000_mode_dependent_address; + +static enum reg_class rs6000_secondary_reload_class (enum reg_class, + enum machine_mode, rtx); +static enum reg_class rs6000_debug_secondary_reload_class (enum reg_class, + enum machine_mode, + rtx); +enum reg_class (*rs6000_secondary_reload_class_ptr) (enum reg_class, + enum machine_mode, rtx) + = rs6000_secondary_reload_class; + +static enum reg_class rs6000_preferred_reload_class (rtx, enum reg_class); +static enum reg_class 
rs6000_debug_preferred_reload_class (rtx, + enum reg_class); +enum reg_class (*rs6000_preferred_reload_class_ptr) (rtx, enum reg_class) + = rs6000_preferred_reload_class; + +static bool rs6000_secondary_memory_needed (enum reg_class, enum reg_class, + enum machine_mode); + +static bool rs6000_debug_secondary_memory_needed (enum reg_class, + enum reg_class, + enum machine_mode); + +bool (*rs6000_secondary_memory_needed_ptr) (enum reg_class, enum reg_class, + enum machine_mode) + = rs6000_secondary_memory_needed; + +static bool rs6000_cannot_change_mode_class (enum machine_mode, + enum machine_mode, + enum reg_class); +static bool rs6000_debug_cannot_change_mode_class (enum machine_mode, + enum machine_mode, + enum reg_class); + +bool (*rs6000_cannot_change_mode_class_ptr) (enum machine_mode, + enum machine_mode, + enum reg_class) + = rs6000_cannot_change_mode_class; + +static enum reg_class rs6000_secondary_reload (bool, rtx, enum reg_class, + enum machine_mode, + struct secondary_reload_info *); + const int INSN_NOT_AVAILABLE = -1; static enum machine_mode rs6000_eh_return_filter_mode (void); @@ -933,6 +1125,17 @@ struct toc_hash_struct GTY(()) }; static GTY ((param_is (struct toc_hash_struct))) htab_t toc_hash_table; + +/* Hash table to keep track of the argument types for builtin functions. */ + +struct builtin_hash_struct GTY(()) +{ + tree type; + enum machine_mode mode[4]; /* return value + 3 arguments. */ + unsigned char uns_p[4]; /* and whether the types are unsigned. */ +}; + +static GTY ((param_is (struct builtin_hash_struct))) htab_t builtin_hash_table; /* Default register names. */ char rs6000_reg_names[][8] = @@ -992,6 +1195,9 @@ static const char alt_reg_names[][8] = #endif #ifndef TARGET_PROFILE_KERNEL #define TARGET_PROFILE_KERNEL 0 +#define SET_PROFILE_KERNEL(N) +#else +#define SET_PROFILE_KERNEL(N) TARGET_PROFILE_KERNEL = (N) #endif /* The VRSAVE bitmask puts bit %v0 as the most significant bit. 
*/ @@ -1090,6 +1296,10 @@ static const char alt_reg_names[][8] = #undef TARGET_VECTOR_ALIGNMENT_REACHABLE #define TARGET_VECTOR_ALIGNMENT_REACHABLE rs6000_vector_alignment_reachable +#undef TARGET_SUPPORT_VECTOR_MISALIGNMENT +#define TARGET_SUPPORT_VECTOR_MISALIGNMENT \ + rs6000_builtin_support_vector_misalignment + #undef TARGET_INIT_BUILTINS #define TARGET_INIT_BUILTINS rs6000_init_builtins @@ -1187,6 +1397,10 @@ static const char alt_reg_names[][8] = #undef TARGET_HANDLE_OPTION #define TARGET_HANDLE_OPTION rs6000_handle_option +#undef TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION +#define TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION \ + rs6000_builtin_vectorized_function + #undef TARGET_DEFAULT_TARGET_FLAGS #define TARGET_DEFAULT_TARGET_FLAGS \ (TARGET_DEFAULT) @@ -1233,30 +1447,98 @@ static const char alt_reg_names[][8] = #undef TARGET_INSTANTIATE_DECLS #define TARGET_INSTANTIATE_DECLS rs6000_instantiate_decls +#undef TARGET_SECONDARY_RELOAD +#define TARGET_SECONDARY_RELOAD rs6000_secondary_reload + struct gcc_target targetm = TARGET_INITIALIZER; +/* Return number of consecutive hard regs needed starting at reg REGNO + to hold something of mode MODE. + This is ordinarily the length in words of a value of mode MODE + but can be less for certain modes in special long registers. + + For the SPE, GPRs are 64 bits but only 32 bits are visible in + scalar instructions. The upper 32 bits are only available to the + SIMD instructions. + + POWER and PowerPC GPRs hold 32 bits worth; + PowerPC64 GPRs and FPRs point register holds 64 bits worth. */ + +static int +rs6000_hard_regno_nregs_internal (int regno, enum machine_mode mode) +{ + unsigned HOST_WIDE_INT reg_size; + + if (FP_REGNO_P (regno)) + reg_size = (VECTOR_MEM_VSX_P (mode) + ? 
UNITS_PER_VSX_WORD + : UNITS_PER_FP_WORD); + + else if (SPE_SIMD_REGNO_P (regno) && TARGET_SPE && SPE_VECTOR_MODE (mode)) + reg_size = UNITS_PER_SPE_WORD; + + else if (ALTIVEC_REGNO_P (regno)) + reg_size = UNITS_PER_ALTIVEC_WORD; + + /* The value returned for SCmode in the E500 double case is 2 for + ABI compatibility; storing an SCmode value in a single register + would require function_arg and rs6000_spe_function_arg to handle + SCmode so as to pass the value correctly in a pair of + registers. */ + else if (TARGET_E500_DOUBLE && FLOAT_MODE_P (mode) && mode != SCmode + && !DECIMAL_FLOAT_MODE_P (mode)) + reg_size = UNITS_PER_FP_WORD; + + else + reg_size = UNITS_PER_WORD; + + return (GET_MODE_SIZE (mode) + reg_size - 1) / reg_size; +} /* Value is 1 if hard register REGNO can hold a value of machine-mode MODE. */ static int rs6000_hard_regno_mode_ok (int regno, enum machine_mode mode) { + int last_regno = regno + rs6000_hard_regno_nregs[mode][regno] - 1; + + /* VSX registers that overlap the FPR registers are larger than for non-VSX + implementations. Don't allow an item to be split between a FP register + and an Altivec register. */ + if (VECTOR_MEM_VSX_P (mode)) + { + if (FP_REGNO_P (regno)) + return FP_REGNO_P (last_regno); + + if (ALTIVEC_REGNO_P (regno)) + return ALTIVEC_REGNO_P (last_regno); + } + /* The GPRs can hold any mode, but values bigger than one register cannot go past R31. */ if (INT_REGNO_P (regno)) - return INT_REGNO_P (regno + HARD_REGNO_NREGS (regno, mode) - 1); + return INT_REGNO_P (last_regno); - /* The float registers can only hold floating modes and DImode. - This excludes the 32-bit decimal float mode for now. */ + /* The float registers (except for VSX vector modes) can only hold floating + modes and DImode. This excludes the 32-bit decimal float mode for + now. 
*/ if (FP_REGNO_P (regno)) - return - ((SCALAR_FLOAT_MODE_P (mode) - && (mode != TDmode || (regno % 2) == 0) - && FP_REGNO_P (regno + HARD_REGNO_NREGS (regno, mode) - 1)) - || (GET_MODE_CLASS (mode) == MODE_INT + { + if (SCALAR_FLOAT_MODE_P (mode) + && (mode != TDmode || (regno % 2) == 0) + && FP_REGNO_P (last_regno)) + return 1; + + if (GET_MODE_CLASS (mode) == MODE_INT && GET_MODE_SIZE (mode) == UNITS_PER_FP_WORD) - || (PAIRED_SIMD_REGNO_P (regno) && TARGET_PAIRED_FLOAT - && PAIRED_VECTOR_MODE (mode))); + return 1; + + if (PAIRED_SIMD_REGNO_P (regno) && TARGET_PAIRED_FLOAT + && PAIRED_VECTOR_MODE (mode)) + return 1; + + return 0; + } /* The CR register can only hold CC modes. */ if (CR_REGNO_P (regno)) @@ -1267,123 +1549,542 @@ rs6000_hard_regno_mode_ok (int regno, en /* AltiVec only in AldyVec registers. */ if (ALTIVEC_REGNO_P (regno)) - return ALTIVEC_VECTOR_MODE (mode); + return VECTOR_MEM_ALTIVEC_OR_VSX_P (mode); /* ...but GPRs can hold SIMD data on the SPE in one register. */ if (SPE_SIMD_REGNO_P (regno) && TARGET_SPE && SPE_VECTOR_MODE (mode)) return 1; - /* We cannot put TImode anywhere except general register and it must be - able to fit within the register set. */ + /* We cannot put TImode anywhere except general register and it must be able + to fit within the register set. In the future, allow TImode in the + Altivec or VSX registers. */ return GET_MODE_SIZE (mode) <= UNITS_PER_WORD; } -/* Initialize rs6000_hard_regno_mode_ok_p table. */ +/* Print interesting facts about registers. */ static void -rs6000_init_hard_regno_mode_ok (void) +rs6000_debug_reg_print (int first_regno, int last_regno, const char *reg_name) { int r, m; - for (r = 0; r < FIRST_PSEUDO_REGISTER; ++r) - for (m = 0; m < NUM_MACHINE_MODES; ++m) - if (rs6000_hard_regno_mode_ok (r, m)) - rs6000_hard_regno_mode_ok_p[m][r] = true; -} + for (r = first_regno; r <= last_regno; ++r) + { + const char *comma = ""; + int len; -#if TARGET_MACHO -/* The Darwin version of SUBTARGET_OVERRIDE_OPTIONS. 
*/ + if (first_regno == last_regno) + fprintf (stderr, "%s:\t", reg_name); + else + fprintf (stderr, "%s%d:\t", reg_name, r - first_regno); + + len = 8; + for (m = 0; m < NUM_MACHINE_MODES; ++m) + if (rs6000_hard_regno_mode_ok_p[m][r] && rs6000_hard_regno_nregs[m][r]) + { + if (len > 70) + { + fprintf (stderr, ",\n\t"); + len = 8; + comma = ""; + } + + if (rs6000_hard_regno_nregs[m][r] > 1) + len += fprintf (stderr, "%s%s/%d", comma, GET_MODE_NAME (m), + rs6000_hard_regno_nregs[m][r]); + else + len += fprintf (stderr, "%s%s", comma, GET_MODE_NAME (m)); + + comma = ", "; + } + + if (call_used_regs[r]) + { + if (len > 70) + { + fprintf (stderr, ",\n\t"); + len = 8; + comma = ""; + } + len += fprintf (stderr, "%s%s", comma, "call-used"); + comma = ", "; + } + + if (fixed_regs[r]) + { + if (len > 70) + { + fprintf (stderr, ",\n\t"); + len = 8; + comma = ""; + } + + len += fprintf (stderr, "%s%s", comma, "fixed"); + comma = ", "; + } + + if (len > 70) + { + fprintf (stderr, ",\n\t"); + comma = ""; + } + + fprintf (stderr, "%sregno = %d\n", comma, r); + } +} + +/* Print various interesting information with -mdebug=reg. */ static void -darwin_rs6000_override_options (void) +rs6000_debug_reg_global (void) { - /* The Darwin ABI always includes AltiVec, can't be (validly) turned - off. */ - rs6000_altivec_abi = 1; - TARGET_ALTIVEC_VRSAVE = 1; - if (DEFAULT_ABI == ABI_DARWIN) - { - if (MACHO_DYNAMIC_NO_PIC_P) - { - if (flag_pic) - warning (0, "-mdynamic-no-pic overrides -fpic or -fPIC"); - flag_pic = 0; - } - else if (flag_pic == 1) + const char *nl = (const char *)0; + int m; + char costly_num[20]; + char nop_num[20]; + const char *costly_str; + const char *nop_str; + + /* Map enum rs6000_vector to string. 
*/ + static const char *rs6000_debug_vector_unit[] = { + "none", + "altivec", + "vsx", + "paired", + "spe", + "other" + }; + + fprintf (stderr, "Register information: (last virtual reg = %d)\n", + LAST_VIRTUAL_REGISTER); + rs6000_debug_reg_print (0, 31, "gr"); + rs6000_debug_reg_print (32, 63, "fp"); + rs6000_debug_reg_print (FIRST_ALTIVEC_REGNO, + LAST_ALTIVEC_REGNO, + "vs"); + rs6000_debug_reg_print (LR_REGNO, LR_REGNO, "lr"); + rs6000_debug_reg_print (CTR_REGNO, CTR_REGNO, "ctr"); + rs6000_debug_reg_print (CR0_REGNO, CR7_REGNO, "cr"); + rs6000_debug_reg_print (MQ_REGNO, MQ_REGNO, "mq"); + rs6000_debug_reg_print (XER_REGNO, XER_REGNO, "xer"); + rs6000_debug_reg_print (VRSAVE_REGNO, VRSAVE_REGNO, "vrsave"); + rs6000_debug_reg_print (VSCR_REGNO, VSCR_REGNO, "vscr"); + rs6000_debug_reg_print (SPE_ACC_REGNO, SPE_ACC_REGNO, "spe_a"); + rs6000_debug_reg_print (SPEFSCR_REGNO, SPEFSCR_REGNO, "spe_f"); + + fprintf (stderr, + "\n" + "d reg_class = %s\n" + "f reg_class = %s\n" + "v reg_class = %s\n" + "wa reg_class = %s\n" + "wd reg_class = %s\n" + "wf reg_class = %s\n" + "ws reg_class = %s\n\n", + reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_d]], + reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_f]], + reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_v]], + reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wa]], + reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wd]], + reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wf]], + reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_ws]]); + + for (m = 0; m < NUM_MACHINE_MODES; ++m) + if (rs6000_vector_unit[m] || rs6000_vector_mem[m]) { - flag_pic = 2; + nl = "\n"; + fprintf (stderr, "Vector mode: %-5s arithmetic: %-8s move: %-8s\n", + GET_MODE_NAME (m), + rs6000_debug_vector_unit[ rs6000_vector_unit[m] ], + rs6000_debug_vector_unit[ rs6000_vector_mem[m] ]); } - } - if (TARGET_64BIT && ! 
TARGET_POWERPC64) + + if (nl) + fputs (nl, stderr); + + switch (rs6000_sched_costly_dep) { - target_flags |= MASK_POWERPC64; - warning (0, "-m64 requires PowerPC64 architecture, enabling"); + case max_dep_latency: + costly_str = "max_dep_latency"; + break; + + case no_dep_costly: + costly_str = "no_dep_costly"; + break; + + case all_deps_costly: + costly_str = "all_deps_costly"; + break; + + case true_store_to_load_dep_costly: + costly_str = "true_store_to_load_dep_costly"; + break; + + case store_to_load_dep_costly: + costly_str = "store_to_load_dep_costly"; + break; + + default: + costly_str = costly_num; + sprintf (costly_num, "%d", (int)rs6000_sched_costly_dep); + break; } - if (flag_mkernel) + + switch (rs6000_sched_insert_nops) { - rs6000_default_long_calls = 1; - target_flags |= MASK_SOFT_FLOAT; - } + case sched_finish_regroup_exact: + nop_str = "sched_finish_regroup_exact"; + break; - /* Make -m64 imply -maltivec. Darwin's 64-bit ABI includes - Altivec. */ - if (!flag_mkernel && !flag_apple_kext - && TARGET_64BIT - && ! (target_flags_explicit & MASK_ALTIVEC)) - target_flags |= MASK_ALTIVEC; + case sched_finish_pad_groups: + nop_str = "sched_finish_pad_groups"; + break; - /* Unless the user (not the configurer) has explicitly overridden - it with -mcpu=G3 or -mno-altivec, then 10.5+ targets default to - G4 unless targetting the kernel. */ - if (!flag_mkernel - && !flag_apple_kext - && strverscmp (darwin_macosx_version_min, "10.5") >= 0 - && ! (target_flags_explicit & MASK_ALTIVEC) - && ! rs6000_select[1].string) - { - target_flags |= MASK_ALTIVEC; + case sched_finish_none: + nop_str = "sched_finish_none"; + break; + + default: + nop_str = nop_num; + sprintf (nop_num, "%d", (int)rs6000_sched_insert_nops); + break; } + + fprintf (stderr, + "always_hint = %s\n" + "align_branch_targets = %s\n" + "sched_restricted_insns_priority = %d\n" + "sched_costly_dep = %s\n" + "sched_insert_nops = %s\n\n", + rs6000_always_hint ? 
"true" : "false", + rs6000_align_branch_targets ? "true" : "false", + (int)rs6000_sched_restricted_insns_priority, + costly_str, nop_str); } -#endif -/* If not otherwise specified by a target, make 'long double' equivalent to - 'double'. */ +/* Initialize the various global tables that are based on register size. */ +static void +rs6000_init_hard_regno_mode_ok (void) +{ + int r, m, c; + int align64; + int align32; -#ifndef RS6000_DEFAULT_LONG_DOUBLE_SIZE -#define RS6000_DEFAULT_LONG_DOUBLE_SIZE 64 -#endif + /* Precalculate REGNO_REG_CLASS. */ + rs6000_regno_regclass[0] = GENERAL_REGS; + for (r = 1; r < 32; ++r) + rs6000_regno_regclass[r] = BASE_REGS; -/* Override command line options. Mostly we process the processor - type and sometimes adjust other TARGET_ options. */ + for (r = 32; r < 64; ++r) + rs6000_regno_regclass[r] = FLOAT_REGS; -void -rs6000_override_options (const char *default_cpu) -{ - size_t i, j; - struct rs6000_cpu_select *ptr; - int set_masks; + for (r = 64; r < FIRST_PSEUDO_REGISTER; ++r) + rs6000_regno_regclass[r] = NO_REGS; - /* Simplifications for entries below. */ + for (r = FIRST_ALTIVEC_REGNO; r <= LAST_ALTIVEC_REGNO; ++r) + rs6000_regno_regclass[r] = ALTIVEC_REGS; - enum { - POWERPC_BASE_MASK = MASK_POWERPC | MASK_NEW_MNEMONICS, - POWERPC_7400_MASK = POWERPC_BASE_MASK | MASK_PPC_GFXOPT | MASK_ALTIVEC - }; + rs6000_regno_regclass[CR0_REGNO] = CR0_REGS; + for (r = CR1_REGNO; r <= CR7_REGNO; ++r) + rs6000_regno_regclass[r] = CR_REGS; - /* This table occasionally claims that a processor does not support - a particular feature even though it does, but the feature is slower - than the alternative. Thus, it shouldn't be relied on as a - complete description of the processor's support. 
+ rs6000_regno_regclass[MQ_REGNO] = MQ_REGS; + rs6000_regno_regclass[LR_REGNO] = LINK_REGS; + rs6000_regno_regclass[CTR_REGNO] = CTR_REGS; + rs6000_regno_regclass[XER_REGNO] = XER_REGS; + rs6000_regno_regclass[VRSAVE_REGNO] = VRSAVE_REGS; + rs6000_regno_regclass[VSCR_REGNO] = VRSAVE_REGS; + rs6000_regno_regclass[SPE_ACC_REGNO] = SPE_ACC_REGS; + rs6000_regno_regclass[SPEFSCR_REGNO] = SPEFSCR_REGS; + rs6000_regno_regclass[ARG_POINTER_REGNUM] = BASE_REGS; + rs6000_regno_regclass[FRAME_POINTER_REGNUM] = BASE_REGS; - Please keep this list in order, and don't forget to update the - documentation in invoke.texi when adding a new processor or - flag. */ - static struct ptt + /* Precalculate vector information, this must be set up before the + rs6000_hard_regno_nregs_internal below. */ + for (m = 0; m < NUM_MACHINE_MODES; ++m) { - const char *const name; /* Canonical processor name. */ - const enum processor_type processor; /* Processor type enum value. */ - const int target_enable; /* Target flags to enable. */ - } const processor_target_table[] + rs6000_vector_unit[m] = rs6000_vector_mem[m] = VECTOR_NONE; + rs6000_vector_reload[m][0] = CODE_FOR_nothing; + rs6000_vector_reload[m][1] = CODE_FOR_nothing; + } + + for (c = 0; c < (int)(int)RS6000_CONSTRAINT_MAX; c++) + rs6000_constraints[c] = NO_REGS; + + /* The VSX hardware allows native alignment for vectors, but control whether the compiler + believes it can use native alignment or still uses 128-bit alignment. */ + if (TARGET_VSX && !TARGET_VSX_ALIGN_128) + { + align64 = 64; + align32 = 32; + } + else + { + align64 = 128; + align32 = 128; + } + + /* V2DF mode, VSX only. */ + if (TARGET_VSX) + { + rs6000_vector_unit[V2DFmode] = VECTOR_VSX; + rs6000_vector_mem[V2DFmode] = VECTOR_VSX; + rs6000_vector_align[V2DFmode] = align64; + } + + /* V4SF mode, either VSX or Altivec. 
*/ + if (TARGET_VSX) + { + rs6000_vector_unit[V4SFmode] = VECTOR_VSX; + rs6000_vector_mem[V4SFmode] = VECTOR_VSX; + rs6000_vector_align[V4SFmode] = align32; + } + else if (TARGET_ALTIVEC) + { + rs6000_vector_unit[V4SFmode] = VECTOR_ALTIVEC; + rs6000_vector_mem[V4SFmode] = VECTOR_ALTIVEC; + rs6000_vector_align[V4SFmode] = align32; + } + + /* V16QImode, V8HImode, V4SImode are Altivec only, but possibly do VSX loads + and stores. */ + if (TARGET_ALTIVEC) + { + rs6000_vector_unit[V4SImode] = VECTOR_ALTIVEC; + rs6000_vector_unit[V8HImode] = VECTOR_ALTIVEC; + rs6000_vector_unit[V16QImode] = VECTOR_ALTIVEC; + rs6000_vector_align[V4SImode] = align32; + rs6000_vector_align[V8HImode] = align32; + rs6000_vector_align[V16QImode] = align32; + + if (TARGET_VSX) + { + rs6000_vector_mem[V4SImode] = VECTOR_VSX; + rs6000_vector_mem[V8HImode] = VECTOR_VSX; + rs6000_vector_mem[V16QImode] = VECTOR_VSX; + } + else + { + rs6000_vector_mem[V4SImode] = VECTOR_ALTIVEC; + rs6000_vector_mem[V8HImode] = VECTOR_ALTIVEC; + rs6000_vector_mem[V16QImode] = VECTOR_ALTIVEC; + } + } + + /* V2DImode, only allow under VSX, which can do V2DI insert/splat/extract. + Altivec doesn't have 64-bit support. */ + if (TARGET_VSX) + { + rs6000_vector_mem[V2DImode] = VECTOR_VSX; + rs6000_vector_unit[V2DImode] = VECTOR_NONE; + rs6000_vector_align[V2DImode] = align64; + } + + /* DFmode, see if we want to use the VSX unit. */ + if (TARGET_VSX && TARGET_VSX_SCALAR_DOUBLE) + { + rs6000_vector_unit[DFmode] = VECTOR_VSX; + rs6000_vector_mem[DFmode] + = (TARGET_VSX_SCALAR_MEMORY ? VECTOR_VSX : VECTOR_NONE); + rs6000_vector_align[DFmode] = align64; + } + + /* TODO add SPE and paired floating point vector support. */ + + /* Register class constraints for the constraints that depend on compile + switches. 
*/ + if (TARGET_HARD_FLOAT && TARGET_FPRS) + rs6000_constraints[RS6000_CONSTRAINT_f] = FLOAT_REGS; + + if (TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT) + rs6000_constraints[RS6000_CONSTRAINT_d] = FLOAT_REGS; + + if (TARGET_VSX) + { + /* At present, we just use VSX_REGS, but we have different constraints + based on the use, in case we want to fine tune the default register + class used. wa = any VSX register, wf = register class to use for + V4SF, wd = register class to use for V2DF, and ws = register class to + use for DF scalars. */ + rs6000_constraints[RS6000_CONSTRAINT_wa] = VSX_REGS; + rs6000_constraints[RS6000_CONSTRAINT_wf] = VSX_REGS; + rs6000_constraints[RS6000_CONSTRAINT_wd] = VSX_REGS; + if (TARGET_VSX_SCALAR_DOUBLE) + rs6000_constraints[RS6000_CONSTRAINT_ws] = VSX_REGS; + } + + if (TARGET_ALTIVEC) + rs6000_constraints[RS6000_CONSTRAINT_v] = ALTIVEC_REGS; + + /* Set up the reload helper functions. */ + if (TARGET_VSX || TARGET_ALTIVEC) + { + if (TARGET_64BIT) + { + rs6000_vector_reload[V16QImode][0] = CODE_FOR_reload_v16qi_di_store; + rs6000_vector_reload[V16QImode][1] = CODE_FOR_reload_v16qi_di_load; + rs6000_vector_reload[V8HImode][0] = CODE_FOR_reload_v8hi_di_store; + rs6000_vector_reload[V8HImode][1] = CODE_FOR_reload_v8hi_di_load; + rs6000_vector_reload[V4SImode][0] = CODE_FOR_reload_v4si_di_store; + rs6000_vector_reload[V4SImode][1] = CODE_FOR_reload_v4si_di_load; + rs6000_vector_reload[V2DImode][0] = CODE_FOR_reload_v2di_di_store; + rs6000_vector_reload[V2DImode][1] = CODE_FOR_reload_v2di_di_load; + rs6000_vector_reload[V4SFmode][0] = CODE_FOR_reload_v4sf_di_store; + rs6000_vector_reload[V4SFmode][1] = CODE_FOR_reload_v4sf_di_load; + rs6000_vector_reload[V2DFmode][0] = CODE_FOR_reload_v2df_di_store; + rs6000_vector_reload[V2DFmode][1] = CODE_FOR_reload_v2df_di_load; + } + else + { + rs6000_vector_reload[V16QImode][0] = CODE_FOR_reload_v16qi_si_store; + rs6000_vector_reload[V16QImode][1] = CODE_FOR_reload_v16qi_si_load; + 
rs6000_vector_reload[V8HImode][0] = CODE_FOR_reload_v8hi_si_store; + rs6000_vector_reload[V8HImode][1] = CODE_FOR_reload_v8hi_si_load; + rs6000_vector_reload[V4SImode][0] = CODE_FOR_reload_v4si_si_store; + rs6000_vector_reload[V4SImode][1] = CODE_FOR_reload_v4si_si_load; + rs6000_vector_reload[V2DImode][0] = CODE_FOR_reload_v2di_si_store; + rs6000_vector_reload[V2DImode][1] = CODE_FOR_reload_v2di_si_load; + rs6000_vector_reload[V4SFmode][0] = CODE_FOR_reload_v4sf_si_store; + rs6000_vector_reload[V4SFmode][1] = CODE_FOR_reload_v4sf_si_load; + rs6000_vector_reload[V2DFmode][0] = CODE_FOR_reload_v2df_si_store; + rs6000_vector_reload[V2DFmode][1] = CODE_FOR_reload_v2df_si_load; + } + } + + /* Precalculate HARD_REGNO_NREGS. */ + for (r = 0; r < FIRST_PSEUDO_REGISTER; ++r) + for (m = 0; m < NUM_MACHINE_MODES; ++m) + rs6000_hard_regno_nregs[m][r] + = rs6000_hard_regno_nregs_internal (r, (enum machine_mode)m); + + /* Precalculate HARD_REGNO_MODE_OK. */ + for (r = 0; r < FIRST_PSEUDO_REGISTER; ++r) + for (m = 0; m < NUM_MACHINE_MODES; ++m) + if (rs6000_hard_regno_mode_ok (r, (enum machine_mode)m)) + rs6000_hard_regno_mode_ok_p[m][r] = true; + + /* Precalculate CLASS_MAX_NREGS sizes. */ + for (c = 0; c < LIM_REG_CLASSES; ++c) + { + int reg_size; + + if (TARGET_VSX && VSX_REG_CLASS_P (c)) + reg_size = UNITS_PER_VSX_WORD; + + else if (c == ALTIVEC_REGS) + reg_size = UNITS_PER_ALTIVEC_WORD; + + else if (c == FLOAT_REGS) + reg_size = UNITS_PER_FP_WORD; + + else + reg_size = UNITS_PER_WORD; + + for (m = 0; m < NUM_MACHINE_MODES; ++m) + rs6000_class_max_nregs[m][c] + = (GET_MODE_SIZE (m) + reg_size - 1) / reg_size; + } + + if (TARGET_E500_DOUBLE) + rs6000_class_max_nregs[DFmode][GENERAL_REGS] = 1; + + if (TARGET_DEBUG_REG) + rs6000_debug_reg_global (); +} + +#if TARGET_MACHO +/* The Darwin version of SUBTARGET_OVERRIDE_OPTIONS. */ + +static void +darwin_rs6000_override_options (void) +{ + /* The Darwin ABI always includes AltiVec, can't be (validly) turned + off. 
*/ + rs6000_altivec_abi = 1; + TARGET_ALTIVEC_VRSAVE = 1; + if (DEFAULT_ABI == ABI_DARWIN) + { + if (MACHO_DYNAMIC_NO_PIC_P) + { + if (flag_pic) + warning (0, "-mdynamic-no-pic overrides -fpic or -fPIC"); + flag_pic = 0; + } + else if (flag_pic == 1) + { + flag_pic = 2; + } + } + if (TARGET_64BIT && ! TARGET_POWERPC64) + { + target_flags |= MASK_POWERPC64; + warning (0, "-m64 requires PowerPC64 architecture, enabling"); + } + if (flag_mkernel) + { + rs6000_default_long_calls = 1; + target_flags |= MASK_SOFT_FLOAT; + } + + /* Make -m64 imply -maltivec. Darwin's 64-bit ABI includes + Altivec. */ + if (!flag_mkernel && !flag_apple_kext + && TARGET_64BIT + && ! (target_flags_explicit & MASK_ALTIVEC)) + target_flags |= MASK_ALTIVEC; + + /* Unless the user (not the configurer) has explicitly overridden + it with -mcpu=G3 or -mno-altivec, then 10.5+ targets default to + G4 unless targeting the kernel. */ + if (!flag_mkernel + && !flag_apple_kext + && strverscmp (darwin_macosx_version_min, "10.5") >= 0 + && ! (target_flags_explicit & MASK_ALTIVEC) + && ! rs6000_select[1].string) + { + target_flags |= MASK_ALTIVEC; + } +} +#endif + +/* If not otherwise specified by a target, make 'long double' equivalent to + 'double'. */ + +#ifndef RS6000_DEFAULT_LONG_DOUBLE_SIZE +#define RS6000_DEFAULT_LONG_DOUBLE_SIZE 64 +#endif + +/* Override command line options. Mostly we process the processor + type and sometimes adjust other TARGET_ options. */ + +void +rs6000_override_options (const char *default_cpu) +{ + size_t i, j; + struct rs6000_cpu_select *ptr; + int set_masks; + + /* Simplifications for entries below. */ + + enum { + POWERPC_BASE_MASK = MASK_POWERPC | MASK_NEW_MNEMONICS, + POWERPC_7400_MASK = POWERPC_BASE_MASK | MASK_PPC_GFXOPT | MASK_ALTIVEC + }; + + /* This table occasionally claims that a processor does not support + a particular feature even though it does, but the feature is slower + than the alternative. 
Thus, it shouldn't be relied on as a + complete description of the processor's support. + + Please keep this list in order, and don't forget to update the + documentation in invoke.texi when adding a new processor or + flag. */ + static struct ptt + { + const char *const name; /* Canonical processor name. */ + const enum processor_type processor; /* Processor type enum value. */ + const int target_enable; /* Target flags to enable. */ + } const processor_target_table[] = {{"401", PROCESSOR_PPC403, POWERPC_BASE_MASK | MASK_SOFT_FLOAT}, {"403", PROCESSOR_PPC403, POWERPC_BASE_MASK | MASK_SOFT_FLOAT | MASK_STRICT_ALIGN}, @@ -1395,6 +2096,10 @@ rs6000_override_options (const char *def POWERPC_BASE_MASK | MASK_SOFT_FLOAT | MASK_MULHW | MASK_DLMZB}, {"440fp", PROCESSOR_PPC440, POWERPC_BASE_MASK | MASK_MULHW | MASK_DLMZB}, + {"464", PROCESSOR_PPC440, + POWERPC_BASE_MASK | MASK_SOFT_FLOAT | MASK_MULHW | MASK_DLMZB}, + {"464fp", PROCESSOR_PPC440, + POWERPC_BASE_MASK | MASK_MULHW | MASK_DLMZB}, {"505", PROCESSOR_MPCCORE, POWERPC_BASE_MASK}, {"601", PROCESSOR_PPC601, MASK_POWER | POWERPC_BASE_MASK | MASK_MULTIPLE | MASK_STRING}, @@ -1414,9 +2119,15 @@ rs6000_override_options (const char *def {"801", PROCESSOR_MPCCORE, POWERPC_BASE_MASK | MASK_SOFT_FLOAT}, {"821", PROCESSOR_MPCCORE, POWERPC_BASE_MASK | MASK_SOFT_FLOAT}, {"823", PROCESSOR_MPCCORE, POWERPC_BASE_MASK | MASK_SOFT_FLOAT}, - {"8540", PROCESSOR_PPC8540, POWERPC_BASE_MASK | MASK_STRICT_ALIGN}, + {"8540", PROCESSOR_PPC8540, POWERPC_BASE_MASK | MASK_STRICT_ALIGN + | MASK_ISEL}, /* 8548 has a dummy entry for now. 
*/ - {"8548", PROCESSOR_PPC8540, POWERPC_BASE_MASK | MASK_STRICT_ALIGN}, + {"8548", PROCESSOR_PPC8540, POWERPC_BASE_MASK | MASK_STRICT_ALIGN + | MASK_ISEL}, + {"e300c2", PROCESSOR_PPCE300C2, POWERPC_BASE_MASK | MASK_SOFT_FLOAT}, + {"e300c3", PROCESSOR_PPCE300C3, POWERPC_BASE_MASK}, + {"e500mc", PROCESSOR_PPCE500MC, POWERPC_BASE_MASK | MASK_PPC_GFXOPT + | MASK_ISEL}, {"860", PROCESSOR_MPCCORE, POWERPC_BASE_MASK | MASK_SOFT_FLOAT}, {"970", PROCESSOR_POWER4, POWERPC_7400_MASK | MASK_PPC_GPOPT | MASK_MFCRF | MASK_POWERPC64}, @@ -1443,14 +2154,16 @@ rs6000_override_options (const char *def POWERPC_BASE_MASK | MASK_POWERPC64 | MASK_PPC_GPOPT | MASK_PPC_GFXOPT | MASK_MFCRF | MASK_POPCNTB | MASK_FPRND}, {"power6", PROCESSOR_POWER6, - POWERPC_7400_MASK | MASK_POWERPC64 | MASK_PPC_GPOPT | MASK_MFCRF - | MASK_POPCNTB | MASK_FPRND | MASK_CMPB | MASK_DFP}, + POWERPC_BASE_MASK | MASK_POWERPC64 | MASK_PPC_GPOPT | MASK_PPC_GFXOPT + | MASK_MFCRF | MASK_POPCNTB | MASK_FPRND | MASK_CMPB | MASK_DFP}, {"power6x", PROCESSOR_POWER6, + POWERPC_BASE_MASK | MASK_POWERPC64 | MASK_PPC_GPOPT | MASK_PPC_GFXOPT + | MASK_MFCRF | MASK_POPCNTB | MASK_FPRND | MASK_CMPB | MASK_DFP + | MASK_MFPGPR}, + {"power7", PROCESSOR_POWER7, POWERPC_7400_MASK | MASK_POWERPC64 | MASK_PPC_GPOPT | MASK_MFCRF - | MASK_POPCNTB | MASK_FPRND | MASK_CMPB | MASK_DFP | MASK_MFPGPR}, - {"power7", PROCESSOR_POWER5, - POWERPC_7400_MASK | MASK_POWERPC64 | MASK_PPC_GPOPT | MASK_MFCRF - | MASK_POPCNTB | MASK_FPRND | MASK_CMPB | MASK_DFP}, + | MASK_POPCNTB | MASK_FPRND | MASK_CMPB | MASK_DFP | MASK_POPCNTD + | MASK_VSX}, /* Don't add MASK_ISEL by default */ {"powerpc", PROCESSOR_POWERPC, POWERPC_BASE_MASK}, {"powerpc64", PROCESSOR_POWERPC64, POWERPC_BASE_MASK | MASK_PPC_GFXOPT | MASK_POWERPC64}, @@ -1477,10 +2190,21 @@ rs6000_override_options (const char *def POWERPC_MASKS = (POWERPC_BASE_MASK | MASK_PPC_GPOPT | MASK_STRICT_ALIGN | MASK_PPC_GFXOPT | MASK_POWERPC64 | MASK_ALTIVEC | MASK_MFCRF | MASK_POPCNTB | MASK_FPRND | 
MASK_MULHW - | MASK_DLMZB | MASK_CMPB | MASK_MFPGPR | MASK_DFP) + | MASK_DLMZB | MASK_CMPB | MASK_MFPGPR | MASK_DFP + | MASK_POPCNTD | MASK_VSX | MASK_ISEL) }; - rs6000_init_hard_regno_mode_ok (); + /* Set the pointer size. */ + if (TARGET_64BIT) + { + rs6000_pmode = (int)DImode; + rs6000_pointer_size = 64; + } + else + { + rs6000_pmode = (int)SImode; + rs6000_pointer_size = 32; + } set_masks = POWER_MASKS | POWERPC_MASKS | MASK_SOFT_FLOAT; #ifdef OS_MISSING_POWERPC64 @@ -1524,10 +2248,16 @@ rs6000_override_options (const char *def } } - if (TARGET_E500) - rs6000_isel = 1; + if (rs6000_cpu == PROCESSOR_PPCE300C2 || rs6000_cpu == PROCESSOR_PPCE300C3 + || rs6000_cpu == PROCESSOR_PPCE500MC) + { + if (TARGET_ALTIVEC) + error ("AltiVec not supported in this target"); + if (TARGET_SPE) + error ("Spe not supported in this target"); + } - /* Disable cell micro code if we are optimizing for the cell + /* Disable Cell microcode if we are optimizing for the Cell and not optimizing for size. */ if (rs6000_gen_cell_microcode == -1) rs6000_gen_cell_microcode = !(rs6000_cpu == PROCESSOR_CELL @@ -1562,17 +2292,85 @@ rs6000_override_options (const char *def } } + /* Add some warnings for VSX. Enable -maltivec unless the user explicitly + used -mno-altivec */ + if (TARGET_VSX) + { + const char *msg = NULL; + if (!TARGET_HARD_FLOAT || !TARGET_FPRS + || !TARGET_SINGLE_FLOAT || !TARGET_DOUBLE_FLOAT) + { + if (target_flags_explicit & MASK_VSX) + msg = N_("-mvsx requires hardware floating point"); + else + target_flags &= ~ MASK_VSX; + } + else if (TARGET_PAIRED_FLOAT) + msg = N_("-mvsx and -mpaired are incompatible"); + /* The hardware will allow VSX and little endian, but until we make sure + things like vector select, etc. work don't allow VSX on little endian + systems at this point. 
*/ + else if (!BYTES_BIG_ENDIAN) + msg = N_("-mvsx used with little endian code"); + else if (TARGET_AVOID_XFORM > 0) + msg = N_("-mvsx needs indexed addressing"); + + if (msg) + { + warning (0, msg); + target_flags &= ~ MASK_VSX; + } + else if (TARGET_VSX && !TARGET_ALTIVEC + && (target_flags_explicit & MASK_ALTIVEC) == 0) + target_flags |= MASK_ALTIVEC; + } + /* Set debug flags */ if (rs6000_debug_name) { if (! strcmp (rs6000_debug_name, "all")) - rs6000_debug_stack = rs6000_debug_arg = 1; + rs6000_debug_stack = rs6000_debug_arg = rs6000_debug_reg + = rs6000_debug_addr = rs6000_debug_cost = 1; else if (! strcmp (rs6000_debug_name, "stack")) rs6000_debug_stack = 1; else if (! strcmp (rs6000_debug_name, "arg")) rs6000_debug_arg = 1; + else if (! strcmp (rs6000_debug_name, "reg")) + rs6000_debug_reg = 1; + else if (! strcmp (rs6000_debug_name, "addr")) + rs6000_debug_addr = 1; + else if (! strcmp (rs6000_debug_name, "cost")) + rs6000_debug_cost = 1; else error ("unknown -mdebug-%s switch", rs6000_debug_name); + + /* If the appropriate debug option is enabled, replace the target hooks + with debug versions that call the real version and then print + debugging information. 
*/ + if (TARGET_DEBUG_COST) + { + targetm.rtx_costs = rs6000_debug_rtx_costs; + targetm.address_cost = rs6000_debug_address_cost; + targetm.sched.adjust_cost = rs6000_debug_adjust_cost; + } + + if (TARGET_DEBUG_ADDR) + { + rs6000_legitimate_address_ptr = rs6000_debug_legitimate_address_p; + rs6000_legitimize_address_ptr = rs6000_debug_legitimize_address; + rs6000_secondary_reload_class_ptr + = rs6000_debug_secondary_reload_class; + rs6000_secondary_memory_needed_ptr + = rs6000_debug_secondary_memory_needed; + rs6000_cannot_change_mode_class_ptr + = rs6000_debug_cannot_change_mode_class; + rs6000_preferred_reload_class_ptr + = rs6000_debug_preferred_reload_class; + rs6000_legitimize_reload_address_ptr + = rs6000_debug_legitimize_reload_address; + rs6000_mode_dependent_address_ptr + = rs6000_debug_mode_dependent_address; + } } if (rs6000_traceback_name) @@ -1597,7 +2395,7 @@ rs6000_override_options (const char *def #endif /* Enable Altivec ABI for AIX -maltivec. */ - if (TARGET_XCOFF && TARGET_ALTIVEC) + if (TARGET_XCOFF && (TARGET_ALTIVEC || TARGET_VSX)) rs6000_altivec_abi = 1; /* The AltiVec ABI is the default for PowerPC-64 GNU/Linux. For @@ -1606,7 +2404,7 @@ rs6000_override_options (const char *def if (TARGET_ELF) { if (!rs6000_explicit_options.altivec_abi - && (TARGET_64BIT || TARGET_ALTIVEC)) + && (TARGET_64BIT || TARGET_ALTIVEC || TARGET_VSX)) rs6000_altivec_abi = 1; /* Enable VRSAVE for AltiVec ABI, unless explicitly overridden. */ @@ -1643,9 +2441,9 @@ rs6000_override_options (const char *def SUB3TARGET_OVERRIDE_OPTIONS; #endif - if (TARGET_E500) + if (TARGET_E500 || rs6000_cpu == PROCESSOR_PPCE500MC) { - /* The e500 does not have string instructions, and we set + /* The e500 and e500mc do not have string instructions, and we set MASK_STRING above when optimizing for size. 
*/ if ((target_flags & MASK_STRING) != 0) target_flags = target_flags & ~MASK_STRING; @@ -1661,8 +2459,8 @@ rs6000_override_options (const char *def rs6000_spe = 0; if (!rs6000_explicit_options.float_gprs) rs6000_float_gprs = 0; - if (!rs6000_explicit_options.isel) - rs6000_isel = 0; + if (!(target_flags_explicit & MASK_ISEL)) + target_flags &= ~MASK_ISEL; } /* Detect invalid option combinations with E500. */ @@ -1670,13 +2468,26 @@ rs6000_override_options (const char *def rs6000_always_hint = (rs6000_cpu != PROCESSOR_POWER4 && rs6000_cpu != PROCESSOR_POWER5 - && rs6000_cpu != PROCESSOR_POWER6 + && rs6000_cpu != PROCESSOR_POWER6 + && rs6000_cpu != PROCESSOR_POWER7 && rs6000_cpu != PROCESSOR_CELL); rs6000_sched_groups = (rs6000_cpu == PROCESSOR_POWER4 - || rs6000_cpu == PROCESSOR_POWER5); + || rs6000_cpu == PROCESSOR_POWER5 + || rs6000_cpu == PROCESSOR_POWER7); rs6000_align_branch_targets = (rs6000_cpu == PROCESSOR_POWER4 - || rs6000_cpu == PROCESSOR_POWER5 - || rs6000_cpu == PROCESSOR_POWER6); + || rs6000_cpu == PROCESSOR_POWER5 + || rs6000_cpu == PROCESSOR_POWER6 + || rs6000_cpu == PROCESSOR_POWER7); + + /* Allow debug switches to override the above settings. */ + if (TARGET_ALWAYS_HINT > 0) + rs6000_always_hint = TARGET_ALWAYS_HINT; + + if (TARGET_SCHED_GROUPS > 0) + rs6000_sched_groups = TARGET_SCHED_GROUPS; + + if (TARGET_ALIGN_BRANCH_TARGETS > 0) + rs6000_align_branch_targets = TARGET_ALIGN_BRANCH_TARGETS; rs6000_sched_restricted_insns_priority = (rs6000_sched_groups ? 1 : 0); @@ -1696,7 +2507,8 @@ rs6000_override_options (const char *def else if (! strcmp (rs6000_sched_costly_dep_str, "store_to_load")) rs6000_sched_costly_dep = store_to_load_dep_costly; else - rs6000_sched_costly_dep = atoi (rs6000_sched_costly_dep_str); + rs6000_sched_costly_dep = ((enum rs6000_dependence_cost) + atoi (rs6000_sched_costly_dep_str)); } /* Handle -minsert-sched-nops option. */ @@ -1712,7 +2524,8 @@ rs6000_override_options (const char *def else if (! 
strcmp (rs6000_sched_insert_nops_str, "regroup_exact")) rs6000_sched_insert_nops = sched_finish_regroup_exact; else - rs6000_sched_insert_nops = atoi (rs6000_sched_insert_nops_str); + rs6000_sched_insert_nops = ((enum rs6000_nop_insertion) + atoi (rs6000_sched_insert_nops_str)); } #ifdef TARGET_REGNAMES @@ -1853,6 +2666,15 @@ rs6000_override_options (const char *def rs6000_cost = &ppc8540_cost; break; + case PROCESSOR_PPCE300C2: + case PROCESSOR_PPCE300C3: + rs6000_cost = &ppce300c2c3_cost; + break; + + case PROCESSOR_PPCE500MC: + rs6000_cost = &ppce500mc_cost; + break; + case PROCESSOR_POWER4: case PROCESSOR_POWER5: rs6000_cost = &power4_cost; @@ -1862,6 +2684,10 @@ rs6000_override_options (const char *def rs6000_cost = &power6_cost; break; + case PROCESSOR_POWER7: + rs6000_cost = &power7_cost; + break; + default: gcc_unreachable (); } @@ -1880,37 +2706,102 @@ rs6000_override_options (const char *def can be optimized to ap = __builtin_next_arg (0). */ if (DEFAULT_ABI != ABI_V4) targetm.expand_builtin_va_start = NULL; + + /* Set up single/double float flags. + If TARGET_HARD_FLOAT is set, but neither single or double is set, + then set both flags. */ + if (TARGET_HARD_FLOAT && TARGET_FPRS + && rs6000_single_float == 0 && rs6000_double_float == 0) + rs6000_single_float = rs6000_double_float = 1; + + /* Reset single and double FP flags if target is E500. */ + if (TARGET_E500) + { + rs6000_single_float = rs6000_double_float = 0; + if (TARGET_E500_SINGLE) + rs6000_single_float = 1; + if (TARGET_E500_DOUBLE) + rs6000_single_float = rs6000_double_float = 1; + } + + /* If not explicitly specified via option, decide whether to generate indexed + load/store instructions. */ + if (TARGET_AVOID_XFORM == -1) + /* Avoid indexed addressing when targeting Power6 in order to avoid + the DERAT mispredict penalty. */ + TARGET_AVOID_XFORM = (rs6000_cpu == PROCESSOR_POWER6 && TARGET_CMPB); + + rs6000_init_hard_regno_mode_ok (); } /* Implement targetm.vectorize.builtin_mask_for_load. 
*/ static tree rs6000_builtin_mask_for_load (void) { - if (TARGET_ALTIVEC) + if (TARGET_ALTIVEC || TARGET_VSX) return altivec_builtin_mask_for_load; else return 0; } -/* Implement targetm.vectorize.builtin_conversion. */ +/* Implement targetm.vectorize.builtin_conversion. + Returns a decl of a function that implements conversion of an integer vector + into a floating-point vector, or vice-versa. TYPE is the type of the integer + side of the conversion. + Return NULL_TREE if it is not available. */ static tree -rs6000_builtin_conversion (enum tree_code code, tree type) +rs6000_builtin_conversion (unsigned int tcode, tree type) { - if (!TARGET_ALTIVEC) - return NULL_TREE; + enum tree_code code = (enum tree_code) tcode; switch (code) { + case FIX_TRUNC_EXPR: + switch (TYPE_MODE (type)) + { + case V2DImode: + if (!VECTOR_UNIT_VSX_P (V2DFmode)) + return NULL_TREE; + + return TYPE_UNSIGNED (type) + ? rs6000_builtin_decls[VSX_BUILTIN_XVCVDPUXDS_UNS] + : rs6000_builtin_decls[VSX_BUILTIN_XVCVDPSXDS]; + + case V4SImode: + if (VECTOR_UNIT_NONE_P (V4SImode) || VECTOR_UNIT_NONE_P (V4SFmode)) + return NULL_TREE; + + return TYPE_UNSIGNED (type) + ? rs6000_builtin_decls[VECTOR_BUILTIN_FIXUNS_V4SF_V4SI] + : rs6000_builtin_decls[VECTOR_BUILTIN_FIX_V4SF_V4SI]; + + default: + return NULL_TREE; + } + case FLOAT_EXPR: switch (TYPE_MODE (type)) { + case V2DImode: + if (!VECTOR_UNIT_VSX_P (V2DFmode)) + return NULL_TREE; + + return TYPE_UNSIGNED (type) + ? rs6000_builtin_decls[VSX_BUILTIN_XVCVUXDDP] + : rs6000_builtin_decls[VSX_BUILTIN_XVCVSXDDP]; + case V4SImode: - return TYPE_UNSIGNED (type) ? - rs6000_builtin_decls[ALTIVEC_BUILTIN_VCFUX] : - rs6000_builtin_decls[ALTIVEC_BUILTIN_VCFSX]; + if (VECTOR_UNIT_NONE_P (V4SImode) || VECTOR_UNIT_NONE_P (V4SFmode)) + return NULL_TREE; + + return TYPE_UNSIGNED (type) + ? 
rs6000_builtin_decls[VECTOR_BUILTIN_UNSFLOAT_V4SI_V4SF] + : rs6000_builtin_decls[VECTOR_BUILTIN_FLOAT_V4SI_V4SF]; + default: return NULL_TREE; } + default: return NULL_TREE; } @@ -1926,14 +2817,14 @@ rs6000_builtin_mul_widen_even (tree type switch (TYPE_MODE (type)) { case V8HImode: - return TYPE_UNSIGNED (type) ? - rs6000_builtin_decls[ALTIVEC_BUILTIN_VMULEUH] : - rs6000_builtin_decls[ALTIVEC_BUILTIN_VMULESH]; + return TYPE_UNSIGNED (type) + ? rs6000_builtin_decls[ALTIVEC_BUILTIN_VMULEUH_UNS] + : rs6000_builtin_decls[ALTIVEC_BUILTIN_VMULESH]; case V16QImode: - return TYPE_UNSIGNED (type) ? - rs6000_builtin_decls[ALTIVEC_BUILTIN_VMULEUB] : - rs6000_builtin_decls[ALTIVEC_BUILTIN_VMULESB]; + return TYPE_UNSIGNED (type) + ? rs6000_builtin_decls[ALTIVEC_BUILTIN_VMULEUB_UNS] + : rs6000_builtin_decls[ALTIVEC_BUILTIN_VMULESB]; default: return NULL_TREE; } @@ -1949,14 +2840,14 @@ rs6000_builtin_mul_widen_odd (tree type) switch (TYPE_MODE (type)) { case V8HImode: - return TYPE_UNSIGNED (type) ? - rs6000_builtin_decls[ALTIVEC_BUILTIN_VMULOUH] : - rs6000_builtin_decls[ALTIVEC_BUILTIN_VMULOSH]; + return TYPE_UNSIGNED (type) + ? rs6000_builtin_decls[ALTIVEC_BUILTIN_VMULOUH_UNS] + : rs6000_builtin_decls[ALTIVEC_BUILTIN_VMULOSH]; case V16QImode: - return TYPE_UNSIGNED (type) ? - rs6000_builtin_decls[ALTIVEC_BUILTIN_VMULOUB] : - rs6000_builtin_decls[ALTIVEC_BUILTIN_VMULOSB]; + return TYPE_UNSIGNED (type) + ? rs6000_builtin_decls[ALTIVEC_BUILTIN_VMULOUB_UNS] + : rs6000_builtin_decls[ALTIVEC_BUILTIN_VMULOSB]; default: return NULL_TREE; } @@ -1993,6 +2884,37 @@ rs6000_vector_alignment_reachable (const } } +/* Return true if the vector misalignment factor is supported by the + target. */ +bool +rs6000_builtin_support_vector_misalignment (enum machine_mode mode, + const_tree type, + int misalignment, + bool is_packed) +{ + if (TARGET_VSX) + { + /* Return if movmisalign pattern is not supported for this mode. 
*/ + if (optab_handler (movmisalign_optab, mode)->insn_code == + CODE_FOR_nothing) + return false; + + if (misalignment == -1) + { + /* misalignment factor is unknown at compile time but we know + it's word aligned. */ + if (rs6000_vector_alignment_reachable (type, is_packed)) + return true; + return false; + } + /* VSX supports word-aligned vector. */ + if (misalignment % 4 == 0) + return true; + } + return false; +} + + /* Handle generic options of the form -mfoo=yes/no. NAME is the option name. VALUE is the option value. @@ -2043,14 +2965,159 @@ optimization_options (int level ATTRIBUT Skip section anchors for Objective C and Objective C++ until front-ends fixed. */ if (!TARGET_MACHO && lang_hooks.name[4] != 'O') - flag_section_anchors = 1; + flag_section_anchors = 2; +} + +static enum fpu_type_t +rs6000_parse_fpu_option (const char *option) +{ + if (!strcmp("none", option)) return FPU_NONE; + if (!strcmp("sp_lite", option)) return FPU_SF_LITE; + if (!strcmp("dp_lite", option)) return FPU_DF_LITE; + if (!strcmp("sp_full", option)) return FPU_SF_FULL; + if (!strcmp("dp_full", option)) return FPU_DF_FULL; + error("unknown value %s for -mfpu", option); + return FPU_NONE; } +/* Returns a function decl for a vectorized version of the builtin function + with builtin function code FN and the result vector type TYPE, or NULL_TREE + if it is not available. 
*/ + +static tree +rs6000_builtin_vectorized_function (unsigned int fn, tree type_out, + tree type_in) +{ + enum machine_mode in_mode, out_mode; + int in_n, out_n; + + if (TREE_CODE (type_out) != VECTOR_TYPE + || TREE_CODE (type_in) != VECTOR_TYPE + || !TARGET_VECTORIZE_BUILTINS) + return NULL_TREE; + + out_mode = TYPE_MODE (TREE_TYPE (type_out)); + out_n = TYPE_VECTOR_SUBPARTS (type_out); + in_mode = TYPE_MODE (TREE_TYPE (type_in)); + in_n = TYPE_VECTOR_SUBPARTS (type_in); + + switch (fn) + { + case BUILT_IN_COPYSIGN: + if (VECTOR_UNIT_VSX_P (V2DFmode) + && out_mode == DFmode && out_n == 2 + && in_mode == DFmode && in_n == 2) + return rs6000_builtin_decls[VSX_BUILTIN_CPSGNDP]; + break; + case BUILT_IN_COPYSIGNF: + if (out_mode != SFmode || out_n != 4 + || in_mode != SFmode || in_n != 4) + break; + if (VECTOR_UNIT_VSX_P (V4SFmode)) + return rs6000_builtin_decls[VSX_BUILTIN_CPSGNSP]; + if (VECTOR_UNIT_ALTIVEC_P (V4SFmode)) + return rs6000_builtin_decls[ALTIVEC_BUILTIN_COPYSIGN_V4SF]; + break; + case BUILT_IN_SQRT: + if (VECTOR_UNIT_VSX_P (V2DFmode) + && out_mode == DFmode && out_n == 2 + && in_mode == DFmode && in_n == 2) + return rs6000_builtin_decls[VSX_BUILTIN_XVSQRTDP]; + break; + case BUILT_IN_SQRTF: + if (VECTOR_UNIT_VSX_P (V4SFmode) + && out_mode == SFmode && out_n == 4 + && in_mode == SFmode && in_n == 4) + return rs6000_builtin_decls[VSX_BUILTIN_XVSQRTSP]; + break; + case BUILT_IN_CEIL: + if (VECTOR_UNIT_VSX_P (V2DFmode) + && out_mode == DFmode && out_n == 2 + && in_mode == DFmode && in_n == 2) + return rs6000_builtin_decls[VSX_BUILTIN_XVRDPIP]; + break; + case BUILT_IN_CEILF: + if (out_mode != SFmode || out_n != 4 + || in_mode != SFmode || in_n != 4) + break; + if (VECTOR_UNIT_VSX_P (V4SFmode)) + return rs6000_builtin_decls[VSX_BUILTIN_XVRSPIP]; + if (VECTOR_UNIT_ALTIVEC_P (V4SFmode)) + return rs6000_builtin_decls[ALTIVEC_BUILTIN_VRFIP]; + break; + case BUILT_IN_FLOOR: + if (VECTOR_UNIT_VSX_P (V2DFmode) + && out_mode == DFmode && out_n == 2 + && in_mode == 
DFmode && in_n == 2) + return rs6000_builtin_decls[VSX_BUILTIN_XVRDPIM]; + break; + case BUILT_IN_FLOORF: + if (out_mode != SFmode || out_n != 4 + || in_mode != SFmode || in_n != 4) + break; + if (VECTOR_UNIT_VSX_P (V4SFmode)) + return rs6000_builtin_decls[VSX_BUILTIN_XVRSPIM]; + if (VECTOR_UNIT_ALTIVEC_P (V4SFmode)) + return rs6000_builtin_decls[ALTIVEC_BUILTIN_VRFIM]; + break; + case BUILT_IN_TRUNC: + if (VECTOR_UNIT_VSX_P (V2DFmode) + && out_mode == DFmode && out_n == 2 + && in_mode == DFmode && in_n == 2) + return rs6000_builtin_decls[VSX_BUILTIN_XVRDPIZ]; + break; + case BUILT_IN_TRUNCF: + if (out_mode != SFmode || out_n != 4 + || in_mode != SFmode || in_n != 4) + break; + if (VECTOR_UNIT_VSX_P (V4SFmode)) + return rs6000_builtin_decls[VSX_BUILTIN_XVRSPIZ]; + if (VECTOR_UNIT_ALTIVEC_P (V4SFmode)) + return rs6000_builtin_decls[ALTIVEC_BUILTIN_VRFIZ]; + break; + case BUILT_IN_NEARBYINT: + if (VECTOR_UNIT_VSX_P (V2DFmode) + && flag_unsafe_math_optimizations + && out_mode == DFmode && out_n == 2 + && in_mode == DFmode && in_n == 2) + return rs6000_builtin_decls[VSX_BUILTIN_XVRDPI]; + break; + case BUILT_IN_NEARBYINTF: + if (VECTOR_UNIT_VSX_P (V4SFmode) + && flag_unsafe_math_optimizations + && out_mode == SFmode && out_n == 4 + && in_mode == SFmode && in_n == 4) + return rs6000_builtin_decls[VSX_BUILTIN_XVRSPI]; + break; + case BUILT_IN_RINT: + if (VECTOR_UNIT_VSX_P (V2DFmode) + && !flag_trapping_math + && out_mode == DFmode && out_n == 2 + && in_mode == DFmode && in_n == 2) + return rs6000_builtin_decls[VSX_BUILTIN_XVRDPIC]; + break; + case BUILT_IN_RINTF: + if (VECTOR_UNIT_VSX_P (V4SFmode) + && !flag_trapping_math + && out_mode == SFmode && out_n == 4 + && in_mode == SFmode && in_n == 4) + return rs6000_builtin_decls[VSX_BUILTIN_XVRSPIC]; + break; + default: + break; + } + return NULL_TREE; +} + + /* Implement TARGET_HANDLE_OPTION. 
*/ static bool rs6000_handle_option (size_t code, const char *arg, int value) { + enum fpu_type_t fpu_type = FPU_NONE; + int isel; + switch (code) { case OPT_mno_power: @@ -2147,14 +3214,29 @@ rs6000_handle_option (size_t code, const rs6000_explicit_options.aix_struct_ret = true; break; + case OPT_mvrsave: + rs6000_explicit_options.vrsave = true; + TARGET_ALTIVEC_VRSAVE = value; + break; + case OPT_mvrsave_: rs6000_explicit_options.vrsave = true; rs6000_parse_yes_no_option ("vrsave", arg, &(TARGET_ALTIVEC_VRSAVE)); break; case OPT_misel_: - rs6000_explicit_options.isel = true; - rs6000_parse_yes_no_option ("isel", arg, &(rs6000_isel)); + target_flags_explicit |= MASK_ISEL; + isel = 0; + rs6000_parse_yes_no_option ("isel", arg, &isel); + if (isel) + target_flags |= MASK_ISEL; + else + target_flags &= ~MASK_ISEL; + break; + + case OPT_mspe: + rs6000_explicit_options.spe = true; + rs6000_spe = value; break; case OPT_mspe_: @@ -2329,6 +3411,61 @@ rs6000_handle_option (size_t code, const return false; } break; + + case OPT_msingle_float: + if (!TARGET_SINGLE_FPU) + warning (0, "-msingle-float option equivalent to -mhard-float"); + /* -msingle-float implies -mno-double-float and TARGET_HARD_FLOAT. */ + rs6000_double_float = 0; + target_flags &= ~MASK_SOFT_FLOAT; + target_flags_explicit |= MASK_SOFT_FLOAT; + break; + + case OPT_mdouble_float: + /* -mdouble-float implies -msingle-float and TARGET_HARD_FLOAT. */ + rs6000_single_float = 1; + target_flags &= ~MASK_SOFT_FLOAT; + target_flags_explicit |= MASK_SOFT_FLOAT; + break; + + case OPT_msimple_fpu: + if (!TARGET_SINGLE_FPU) + warning (0, "-msimple-fpu option ignored"); + break; + + case OPT_mhard_float: + /* -mhard_float implies -msingle-float and -mdouble-float. */ + rs6000_single_float = rs6000_double_float = 1; + break; + + case OPT_msoft_float: + /* -msoft_float implies -mnosingle-float and -mnodouble-float. 
*/ + rs6000_single_float = rs6000_double_float = 0; + break; + + case OPT_mfpu_: + fpu_type = rs6000_parse_fpu_option(arg); + if (fpu_type != FPU_NONE) + /* If -mfpu is not none, then turn off SOFT_FLOAT, turn on HARD_FLOAT. */ + { + target_flags &= ~MASK_SOFT_FLOAT; + target_flags_explicit |= MASK_SOFT_FLOAT; + rs6000_xilinx_fpu = 1; + if (fpu_type == FPU_SF_LITE || fpu_type == FPU_SF_FULL) + rs6000_single_float = 1; + if (fpu_type == FPU_DF_LITE || fpu_type == FPU_DF_FULL) + rs6000_single_float = rs6000_double_float = 1; + if (fpu_type == FPU_SF_LITE || fpu_type == FPU_DF_LITE) + rs6000_simple_fpu = 1; + } + else + { + /* -mfpu=none is equivalent to -msoft-float */ + target_flags |= MASK_SOFT_FLOAT; + target_flags_explicit |= MASK_SOFT_FLOAT; + rs6000_single_float = rs6000_double_float = 0; + } + break; } return true; } @@ -2398,11 +3535,16 @@ rs6000_file_start (void) if (TARGET_32BIT && DEFAULT_ABI == ABI_V4) { fprintf (file, "\t.gnu_attribute 4, %d\n", - (TARGET_HARD_FLOAT && TARGET_FPRS) ? 1 : 2); + ((TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT) ? 1 + : (TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT) ? 3 + : 2)); fprintf (file, "\t.gnu_attribute 8, %d\n", (TARGET_ALTIVEC_ABI ? 2 : TARGET_SPE_ABI ? 3 : 1)); + fprintf (file, "\t.gnu_attribute 12, %d\n", + aix_struct_return ? 2 : 1); + } #endif @@ -2608,6 +3750,11 @@ vspltis_constant (rtx op, unsigned step, && (splat_val >= 0 || (step == 1 && copies == 1))) ; + /* Also check if are loading up the most significant bit which can be done by + loading up -1 and shifting the value left by -1. 
*/ + else if (EASY_VECTOR_MSB (splat_val, inner)) + ; + else return false; @@ -2718,6 +3865,9 @@ output_vec_const_move (rtx *operands) vec = operands[1]; mode = GET_MODE (dest); + if (TARGET_VSX && zero_constant (vec, mode)) + return "xxlxor %x0,%x0,%x0"; + if (TARGET_ALTIVEC) { rtx splat_vec; @@ -2771,7 +3921,7 @@ paired_expand_vector_init (rtx target, r enum machine_mode mode = GET_MODE (target); int n_elts = GET_MODE_NUNITS (mode); int n_var = 0; - rtx x, new, tmp, constant_op, op1, op2; + rtx x, new_rtx, tmp, constant_op, op1, op2; int i; for (i = 0; i < n_elts; ++i) @@ -2790,10 +3940,10 @@ paired_expand_vector_init (rtx target, r if (n_var == 2) { /* The vector is initialized only with non-constants. */ - new = gen_rtx_VEC_CONCAT (V2SFmode, XVECEXP (vals, 0, 0), + new_rtx = gen_rtx_VEC_CONCAT (V2SFmode, XVECEXP (vals, 0, 0), XVECEXP (vals, 0, 1)); - emit_move_insn (target, new); + emit_move_insn (target, new_rtx); return; } @@ -2809,11 +3959,11 @@ paired_expand_vector_init (rtx target, r emit_move_insn (tmp, constant_op); if (CONSTANT_P (op1)) - new = gen_rtx_VEC_CONCAT (V2SFmode, tmp, op2); + new_rtx = gen_rtx_VEC_CONCAT (V2SFmode, tmp, op2); else - new = gen_rtx_VEC_CONCAT (V2SFmode, op1, tmp); + new_rtx = gen_rtx_VEC_CONCAT (V2SFmode, op1, tmp); - emit_move_insn (target, new); + emit_move_insn (target, new_rtx); } void @@ -2940,31 +4090,89 @@ rs6000_expand_vector_init (rtx target, r if (n_var == 0) { - if (mode != V4SFmode && all_const_zero) + rtx const_vec = gen_rtx_CONST_VECTOR (mode, XVEC (vals, 0)); + bool int_vector_p = (GET_MODE_CLASS (mode) == MODE_VECTOR_INT); + if ((int_vector_p || TARGET_VSX) && all_const_zero) { /* Zero register. */ emit_insn (gen_rtx_SET (VOIDmode, target, gen_rtx_XOR (mode, target, target))); return; } - else if (mode != V4SFmode && easy_vector_constant (vals, mode)) + else if (int_vector_p && easy_vector_constant (const_vec, mode)) { /* Splat immediate. 
*/ - emit_insn (gen_rtx_SET (VOIDmode, target, vals)); + emit_insn (gen_rtx_SET (VOIDmode, target, const_vec)); return; } - else if (all_same) - ; /* Splat vector element. */ else { /* Load from constant pool. */ - emit_move_insn (target, gen_rtx_CONST_VECTOR (mode, XVEC (vals, 0))); + emit_move_insn (target, const_vec); return; } } - /* Store value to stack temp. Load vector element. Splat. */ - if (all_same) + /* Double word values on VSX can use xxpermdi or lxvdsx. */ + if (VECTOR_MEM_VSX_P (mode) && (mode == V2DFmode || mode == V2DImode)) + { + if (all_same) + { + rtx element = XVECEXP (vals, 0, 0); + if (mode == V2DFmode) + emit_insn (gen_vsx_splat_v2df (target, element)); + else + emit_insn (gen_vsx_splat_v2di (target, element)); + } + else + { + rtx op0 = copy_to_reg (XVECEXP (vals, 0, 0)); + rtx op1 = copy_to_reg (XVECEXP (vals, 0, 1)); + if (mode == V2DFmode) + emit_insn (gen_vsx_concat_v2df (target, op0, op1)); + else + emit_insn (gen_vsx_concat_v2di (target, op0, op1)); + } + return; + } + + /* With single precision floating point on VSX, know that internally single + precision is actually represented as a double, and either make 2 V2DF + vectors, and convert these vectors to single precision, or do one + conversion, and splat the result to the other elements. 
*/ + if (mode == V4SFmode && VECTOR_MEM_VSX_P (mode)) + { + if (all_same) + { + rtx freg = gen_reg_rtx (V4SFmode); + rtx sreg = copy_to_reg (XVECEXP (vals, 0, 0)); + + emit_insn (gen_vsx_xscvdpsp_scalar (freg, sreg)); + emit_insn (gen_vsx_xxspltw_v4sf (target, freg, const0_rtx)); + } + else + { + rtx dbl_even = gen_reg_rtx (V2DFmode); + rtx dbl_odd = gen_reg_rtx (V2DFmode); + rtx flt_even = gen_reg_rtx (V4SFmode); + rtx flt_odd = gen_reg_rtx (V4SFmode); + + emit_insn (gen_vsx_concat_v2sf (dbl_even, + copy_to_reg (XVECEXP (vals, 0, 0)), + copy_to_reg (XVECEXP (vals, 0, 1)))); + emit_insn (gen_vsx_concat_v2sf (dbl_odd, + copy_to_reg (XVECEXP (vals, 0, 2)), + copy_to_reg (XVECEXP (vals, 0, 3)))); + emit_insn (gen_vsx_xvcvdpsp (flt_even, dbl_even)); + emit_insn (gen_vsx_xvcvdpsp (flt_odd, dbl_odd)); + emit_insn (gen_vec_extract_evenv4sf (target, flt_even, flt_odd)); + } + return; + } + + /* Store value to stack temp. Load vector element. Splat. However, splat + of 64-bit items is not supported on Altivec. */ + if (all_same && GET_MODE_SIZE (mode) <= 4) { mem = assign_stack_temp (mode, GET_MODE_SIZE (inner_mode), 0); emit_move_insn (adjust_address_nv (mem, inner_mode, 0), @@ -3022,6 +4230,14 @@ rs6000_expand_vector_set (rtx target, rt int width = GET_MODE_SIZE (inner_mode); int i; + if (VECTOR_MEM_VSX_P (mode) && (mode == V2DFmode || mode == V2DImode)) + { + rtx (*set_func) (rtx, rtx, rtx, rtx) + = ((mode == V2DFmode) ? gen_vsx_set_v2df : gen_vsx_set_v2di); + emit_insn (set_func (target, target, val, GEN_INT (elt))); + return; + } + /* Load single variable value. */ mem = assign_stack_temp (mode, GET_MODE_SIZE (inner_mode), 0); emit_move_insn (adjust_address_nv (mem, inner_mode, 0), val); @@ -3059,6 +4275,14 @@ rs6000_expand_vector_extract (rtx target enum machine_mode inner_mode = GET_MODE_INNER (mode); rtx mem, x; + if (VECTOR_MEM_VSX_P (mode) && (mode == V2DFmode || mode == V2DImode)) + { + rtx (*extract_func) (rtx, rtx, rtx) + = ((mode == V2DFmode) ? 
gen_vsx_extract_v2df : gen_vsx_extract_v2di); + emit_insn (extract_func (target, vec, GEN_INT (elt))); + return; + } + /* Allocate mode-sized buffer. */ mem = assign_stack_temp (mode, GET_MODE_SIZE (mode), 0); @@ -3153,24 +4377,26 @@ invalid_e500_subreg (rtx op, enum machin if (TARGET_E500_DOUBLE) { /* Reject (subreg:SI (reg:DF)); likewise with subreg:DI or - subreg:TI and reg:TF. */ + subreg:TI and reg:TF. Decimal float modes are like integer + modes (only low part of each register used) for this + purpose. */ if (GET_CODE (op) == SUBREG - && (mode == SImode || mode == DImode || mode == TImode) + && (mode == SImode || mode == DImode || mode == TImode + || mode == DDmode || mode == TDmode) && REG_P (SUBREG_REG (op)) && (GET_MODE (SUBREG_REG (op)) == DFmode - || GET_MODE (SUBREG_REG (op)) == TFmode - || GET_MODE (SUBREG_REG (op)) == DDmode - || GET_MODE (SUBREG_REG (op)) == TDmode)) + || GET_MODE (SUBREG_REG (op)) == TFmode)) return true; /* Reject (subreg:DF (reg:DI)); likewise with subreg:TF and reg:TI. */ if (GET_CODE (op) == SUBREG - && (mode == DFmode || mode == TFmode - || mode == DDmode || mode == TDmode) + && (mode == DFmode || mode == TFmode) && REG_P (SUBREG_REG (op)) && (GET_MODE (SUBREG_REG (op)) == DImode - || GET_MODE (SUBREG_REG (op)) == TImode)) + || GET_MODE (SUBREG_REG (op)) == TImode + || GET_MODE (SUBREG_REG (op)) == DDmode + || GET_MODE (SUBREG_REG (op)) == TDmode)) return true; } @@ -3303,60 +4529,82 @@ gpr_or_gpr_p (rtx op0, rtx op1) } -/* Subroutines of rs6000_legitimize_address and rs6000_legitimate_address. */ +/* Subroutines of rs6000_legitimize_address and rs6000_legitimate_address_p. 
*/ -static int -constant_pool_expr_1 (rtx op, int *have_sym, int *have_toc) +static bool +reg_offset_addressing_ok_p (enum machine_mode mode) { - switch (GET_CODE (op)) + switch (mode) { - case SYMBOL_REF: - if (RS6000_SYMBOL_REF_TLS_P (op)) - return 0; - else if (CONSTANT_POOL_ADDRESS_P (op)) - { - if (ASM_OUTPUT_SPECIAL_POOL_ENTRY_P (get_pool_constant (op), Pmode)) - { - *have_sym = 1; - return 1; - } - else - return 0; - } - else if (! strcmp (XSTR (op, 0), toc_label_name)) - { - *have_toc = 1; - return 1; - } - else - return 0; - case PLUS: - case MINUS: - return (constant_pool_expr_1 (XEXP (op, 0), have_sym, have_toc) - && constant_pool_expr_1 (XEXP (op, 1), have_sym, have_toc)); - case CONST: - return constant_pool_expr_1 (XEXP (op, 0), have_sym, have_toc); - case CONST_INT: - return 1; + case V16QImode: + case V8HImode: + case V4SFmode: + case V4SImode: + case V2DFmode: + case V2DImode: + /* AltiVec/VSX vector modes. Only reg+reg addressing is valid. */ + if (VECTOR_MEM_ALTIVEC_OR_VSX_P (mode)) + return false; + break; + + case V4HImode: + case V2SImode: + case V1DImode: + case V2SFmode: + /* Paired vector modes. Only reg+reg addressing is valid. 
*/ + if (TARGET_PAIRED_FLOAT) + return false; + break; + default: - return 0; + break; } + + return true; +} + +static bool +virtual_stack_registers_memory_p (rtx op) +{ + int regnum; + + if (GET_CODE (op) == REG) + regnum = REGNO (op); + + else if (GET_CODE (op) == PLUS + && GET_CODE (XEXP (op, 0)) == REG + && GET_CODE (XEXP (op, 1)) == CONST_INT) + regnum = REGNO (XEXP (op, 0)); + + else + return false; + + return (regnum >= FIRST_VIRTUAL_REGISTER + && regnum <= LAST_VIRTUAL_REGISTER); } static bool constant_pool_expr_p (rtx op) { - int have_sym = 0; - int have_toc = 0; - return constant_pool_expr_1 (op, &have_sym, &have_toc) && have_sym; + rtx base, offset; + + split_const (op, &base, &offset); + return (GET_CODE (base) == SYMBOL_REF + && CONSTANT_POOL_ADDRESS_P (base) + && ASM_OUTPUT_SPECIAL_POOL_ENTRY_P (get_pool_constant (base), Pmode)); } bool toc_relative_expr_p (rtx op) { - int have_sym = 0; - int have_toc = 0; - return constant_pool_expr_1 (op, &have_sym, &have_toc) && have_toc; + rtx base, offset; + + if (GET_CODE (op) != CONST) + return false; + + split_const (op, &base, &offset); + return (GET_CODE (base) == UNSPEC + && XINT (base, 1) == UNSPEC_TOCREL); } bool @@ -3366,7 +4614,7 @@ legitimate_constant_pool_address_p (rtx && GET_CODE (x) == PLUS && GET_CODE (XEXP (x, 0)) == REG && (TARGET_MINIMAL_TOC || REGNO (XEXP (x, 0)) == TOC_REGISTER) - && constant_pool_expr_p (XEXP (x, 1))); + && toc_relative_expr_p (XEXP (x, 1))); } static bool @@ -3392,6 +4640,8 @@ rs6000_legitimate_offset_address_p (enum return false; if (!INT_REG_OK_FOR_BASE_P (XEXP (x, 0), strict)) return false; + if (!reg_offset_addressing_ok_p (mode)) + return virtual_stack_registers_memory_p (x); if (legitimate_constant_pool_address_p (x)) return true; if (GET_CODE (XEXP (x, 1)) != CONST_INT) @@ -3401,30 +4651,23 @@ rs6000_legitimate_offset_address_p (enum extra = 0; switch (mode) { - case V16QImode: - case V8HImode: - case V4SFmode: - case V4SImode: - /* AltiVec vector modes. 
Only reg+reg addressing is valid and - constant offset zero should not occur due to canonicalization. */ - return false; - case V4HImode: case V2SImode: case V1DImode: case V2SFmode: - /* Paired vector modes. Only reg+reg addressing is valid and - constant offset zero should not occur due to canonicalization. */ - if (TARGET_PAIRED_FLOAT) - return false; /* SPE vector modes. */ return SPE_CONST_OFFSET_OK (offset); case DFmode: - case DDmode: if (TARGET_E500_DOUBLE) return SPE_CONST_OFFSET_OK (offset); + /* If we are using VSX scalar loads, restrict ourselves to reg+reg + addressing. */ + if (VECTOR_MEM_VSX_P (DFmode)) + return false; + + case DDmode: case DImode: /* On e500v2, we may have: @@ -3441,11 +4684,11 @@ rs6000_legitimate_offset_address_p (enum break; case TFmode: - case TDmode: if (TARGET_E500_DOUBLE) return (SPE_CONST_OFFSET_OK (offset) && SPE_CONST_OFFSET_OK (offset + 8)); + case TDmode: case TImode: if (mode == TFmode || mode == TDmode || !TARGET_POWERPC64) extra = 12; @@ -3489,6 +4732,14 @@ legitimate_indexed_address_p (rtx x, int && INT_REG_OK_FOR_INDEX_P (op0, strict)))); } +bool +avoiding_indexed_address_p (enum machine_mode mode) +{ + /* Avoid indexed addressing for modes that have non-indexed + load/store instruction forms. */ + return (TARGET_AVOID_XFORM && VECTOR_MEM_NONE_P (mode)); +} + inline bool legitimate_indirect_address_p (rtx x, int strict) { @@ -3540,7 +4791,7 @@ legitimate_lo_sum_address_p (enum machin return false; if (GET_MODE_BITSIZE (mode) > 64 || (GET_MODE_BITSIZE (mode) > 32 && !TARGET_POWERPC64 - && !(TARGET_HARD_FLOAT && TARGET_FPRS + && !(TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && (mode == DFmode || mode == DDmode)))) return false; @@ -3559,8 +4810,6 @@ legitimate_lo_sum_address_p (enum machin called. In some cases it is useful to look at this to decide what needs to be done. - MODE is passed so that this function can use GO_IF_LEGITIMATE_ADDRESS. - It is always safe for this function to do nothing. 
It exists to recognize opportunities to optimize the output. @@ -3574,10 +4823,30 @@ legitimate_lo_sum_address_p (enum machin Then check for the sum of a register and something not constant, try to load the other things into a register and return the sum. */ -rtx +static rtx rs6000_legitimize_address (rtx x, rtx oldx ATTRIBUTE_UNUSED, enum machine_mode mode) { + unsigned int extra = 0; + + if (!reg_offset_addressing_ok_p (mode)) + { + if (virtual_stack_registers_memory_p (x)) + return x; + + /* In theory we should not be seeing addresses of the form reg+0, + but just in case it is generated, optimize it away. */ + if (GET_CODE (x) == PLUS && XEXP (x, 1) == const0_rtx) + return force_reg (Pmode, XEXP (x, 0)); + + /* Make sure both operands are registers. */ + else if (GET_CODE (x) == PLUS) + return gen_rtx_PLUS (Pmode, + force_reg (Pmode, XEXP (x, 0)), + force_reg (Pmode, XEXP (x, 1))); + else + return force_reg (Pmode, x); + } if (GET_CODE (x) == SYMBOL_REF) { enum tls_model model = SYMBOL_REF_TLS_MODEL (x); @@ -3585,50 +4854,66 @@ rs6000_legitimize_address (rtx x, rtx ol return rs6000_legitimize_tls_address (x, model); } + switch (mode) + { + case DFmode: + case DDmode: + extra = 4; + break; + case DImode: + if (!TARGET_POWERPC64) + extra = 4; + break; + case TFmode: + case TDmode: + extra = 12; + break; + case TImode: + extra = TARGET_POWERPC64 ? 
8 : 12; + break; + default: + break; + } + if (GET_CODE (x) == PLUS && GET_CODE (XEXP (x, 0)) == REG && GET_CODE (XEXP (x, 1)) == CONST_INT - && (unsigned HOST_WIDE_INT) (INTVAL (XEXP (x, 1)) + 0x8000) >= 0x10000 - && !(SPE_VECTOR_MODE (mode) - || ALTIVEC_VECTOR_MODE (mode) + && ((unsigned HOST_WIDE_INT) (INTVAL (XEXP (x, 1)) + 0x8000) + >= 0x10000 - extra) + && !((TARGET_POWERPC64 + && (mode == DImode || mode == TImode) + && (INTVAL (XEXP (x, 1)) & 3) != 0) + || SPE_VECTOR_MODE (mode) || (TARGET_E500_DOUBLE && (mode == DFmode || mode == TFmode - || mode == DImode)))) + || mode == DImode || mode == DDmode + || mode == TDmode)))) { HOST_WIDE_INT high_int, low_int; rtx sum; low_int = ((INTVAL (XEXP (x, 1)) & 0xffff) ^ 0x8000) - 0x8000; + if (low_int >= 0x8000 - extra) + low_int = 0; high_int = INTVAL (XEXP (x, 1)) - low_int; sum = force_operand (gen_rtx_PLUS (Pmode, XEXP (x, 0), GEN_INT (high_int)), 0); - return gen_rtx_PLUS (Pmode, sum, GEN_INT (low_int)); + return plus_constant (sum, low_int); } else if (GET_CODE (x) == PLUS && GET_CODE (XEXP (x, 0)) == REG && GET_CODE (XEXP (x, 1)) != CONST_INT && GET_MODE_NUNITS (mode) == 1 - && ((TARGET_HARD_FLOAT && TARGET_FPRS) + && ((TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT) || TARGET_POWERPC64 || ((mode != DImode && mode != DFmode && mode != DDmode) - || TARGET_E500_DOUBLE)) + || (TARGET_E500_DOUBLE && mode != DDmode))) && (TARGET_POWERPC64 || mode != DImode) + && !avoiding_indexed_address_p (mode) && mode != TImode && mode != TFmode && mode != TDmode) { return gen_rtx_PLUS (Pmode, XEXP (x, 0), - force_reg (Pmode, force_operand (XEXP (x, 1), 0))); - } - else if (ALTIVEC_VECTOR_MODE (mode)) - { - rtx reg; - - /* Make sure both operands are registers. 
*/ - if (GET_CODE (x) == PLUS) - return gen_rtx_PLUS (Pmode, force_reg (Pmode, XEXP (x, 0)), - force_reg (Pmode, XEXP (x, 1))); - - reg = force_reg (Pmode, x); - return reg; + force_reg (Pmode, force_operand (XEXP (x, 1), 0))); } else if (SPE_VECTOR_MODE (mode) || (TARGET_E500_DOUBLE && (mode == DFmode || mode == TFmode @@ -3636,7 +4921,7 @@ rs6000_legitimize_address (rtx x, rtx ol || mode == DImode))) { if (mode == DImode) - return NULL_RTX; + return x; /* We accept [reg + reg] and [reg + OFFSET]. */ if (GET_CODE (x) == PLUS) @@ -3658,7 +4943,7 @@ rs6000_legitimize_address (rtx x, rtx ol reg + offset] is not a legitimate addressing mode. */ y = gen_rtx_PLUS (Pmode, op1, op2); - if (GET_MODE_SIZE (mode) > 8 && REG_P (op2)) + if ((GET_MODE_SIZE (mode) > 8 || mode == DDmode) && REG_P (op2)) return force_reg (Pmode, y); else return y; @@ -3675,7 +4960,7 @@ rs6000_legitimize_address (rtx x, rtx ol && CONSTANT_P (x) && GET_MODE_NUNITS (mode) == 1 && (GET_MODE_BITSIZE (mode) <= 32 - || ((TARGET_HARD_FLOAT && TARGET_FPRS) + || ((TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT) && (mode == DFmode || mode == DDmode)))) { rtx reg = gen_reg_rtx (Pmode); @@ -3690,7 +4975,8 @@ rs6000_legitimize_address (rtx x, rtx ol && GET_CODE (x) != CONST_INT && GET_CODE (x) != CONST_DOUBLE && CONSTANT_P (x) - && ((TARGET_HARD_FLOAT && TARGET_FPRS) + && GET_MODE_NUNITS (mode) == 1 + && ((TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT) || (mode != DFmode && mode != DDmode)) && mode != DImode && mode != TImode) @@ -3700,13 +4986,64 @@ rs6000_legitimize_address (rtx x, rtx ol return gen_rtx_LO_SUM (Pmode, reg, x); } else if (TARGET_TOC + && GET_CODE (x) == SYMBOL_REF && constant_pool_expr_p (x) && ASM_OUTPUT_SPECIAL_POOL_ENTRY_P (get_pool_constant (x), Pmode)) { return create_TOC_reference (x); } else - return NULL_RTX; + return x; +} + +/* Debug version of rs6000_legitimize_address. 
*/ +static rtx +rs6000_debug_legitimize_address (rtx x, rtx oldx, enum machine_mode mode) +{ + rtx ret; + rtx insns; + + start_sequence (); + ret = rs6000_legitimize_address (x, oldx, mode); + insns = get_insns (); + end_sequence (); + + if (ret != x) + { + fprintf (stderr, + "\nrs6000_legitimize_address: mode %s, old code %s, " + "new code %s, modified\n", + GET_MODE_NAME (mode), GET_RTX_NAME (GET_CODE (x)), + GET_RTX_NAME (GET_CODE (ret))); + + fprintf (stderr, "Original address:\n"); + debug_rtx (x); + + fprintf (stderr, "oldx:\n"); + debug_rtx (oldx); + + fprintf (stderr, "New address:\n"); + debug_rtx (ret); + + if (insns) + { + fprintf (stderr, "Insns added:\n"); + debug_rtx_list (insns, 20); + } + } + else + { + fprintf (stderr, + "\nrs6000_legitimize_address: mode %s, code %s, no change:\n", + GET_MODE_NAME (mode), GET_RTX_NAME (GET_CODE (x))); + + debug_rtx (x); + } + + if (insns) + emit_insn (insns); + + return ret; } /* This is called from dwarf2out.c via TARGET_ASM_OUTPUT_DWARF_DTPREL. 
@@ -3843,7 +5180,6 @@ rs6000_legitimize_tls_address (rtx addr, emit_insn (gen_addsi3 (tmp3, tmp1, tmp2)); last = emit_move_insn (got, tmp3); set_unique_reg_note (last, REG_EQUAL, gsym); - maybe_encapsulate_block (first, last, gsym); } } } @@ -3851,17 +5187,23 @@ rs6000_legitimize_tls_address (rtx addr, if (model == TLS_MODEL_GLOBAL_DYNAMIC) { r3 = gen_rtx_REG (Pmode, 3); - if (TARGET_64BIT) - insn = gen_tls_gd_64 (r3, got, addr); + tga = rs6000_tls_get_addr (); + + if (DEFAULT_ABI == ABI_AIX && TARGET_64BIT) + insn = gen_tls_gd_aix64 (r3, got, addr, tga, const0_rtx); + else if (DEFAULT_ABI == ABI_AIX && !TARGET_64BIT) + insn = gen_tls_gd_aix32 (r3, got, addr, tga, const0_rtx); + else if (DEFAULT_ABI == ABI_V4) + insn = gen_tls_gd_sysvsi (r3, got, addr, tga, const0_rtx); else - insn = gen_tls_gd_32 (r3, got, addr); + gcc_unreachable (); + start_sequence (); - emit_insn (insn); - tga = gen_rtx_MEM (Pmode, rs6000_tls_get_addr ()); - insn = gen_call_value (r3, tga, const0_rtx, const0_rtx); insn = emit_call_insn (insn); CONST_OR_PURE_CALL_P (insn) = 1; use_reg (&CALL_INSN_FUNCTION_USAGE (insn), r3); + if (DEFAULT_ABI == ABI_V4 && TARGET_SECURE_PLT && flag_pic) + use_reg (&CALL_INSN_FUNCTION_USAGE (insn), pic_offset_table_rtx); insn = get_insns (); end_sequence (); emit_libcall_block (insn, dest, r3, addr); @@ -3869,17 +5211,23 @@ rs6000_legitimize_tls_address (rtx addr, else if (model == TLS_MODEL_LOCAL_DYNAMIC) { r3 = gen_rtx_REG (Pmode, 3); - if (TARGET_64BIT) - insn = gen_tls_ld_64 (r3, got); + tga = rs6000_tls_get_addr (); + + if (DEFAULT_ABI == ABI_AIX && TARGET_64BIT) + insn = gen_tls_ld_aix64 (r3, got, tga, const0_rtx); + else if (DEFAULT_ABI == ABI_AIX && !TARGET_64BIT) + insn = gen_tls_ld_aix32 (r3, got, tga, const0_rtx); + else if (DEFAULT_ABI == ABI_V4) + insn = gen_tls_ld_sysvsi (r3, got, tga, const0_rtx); else - insn = gen_tls_ld_32 (r3, got); + gcc_unreachable (); + start_sequence (); - emit_insn (insn); - tga = gen_rtx_MEM (Pmode, rs6000_tls_get_addr ()); 
- insn = gen_call_value (r3, tga, const0_rtx, const0_rtx); insn = emit_call_insn (insn); CONST_OR_PURE_CALL_P (insn) = 1; use_reg (&CALL_INSN_FUNCTION_USAGE (insn), r3); + if (DEFAULT_ABI == ABI_V4 && TARGET_SECURE_PLT && flag_pic) + use_reg (&CALL_INSN_FUNCTION_USAGE (insn), pic_offset_table_rtx); insn = get_insns (); end_sequence (); tmp1 = gen_reg_rtx (Pmode); @@ -3959,13 +5307,6 @@ rs6000_tls_symbol_ref_1 (rtx *x, void *d return RS6000_SYMBOL_REF_TLS_P (*x); } -/* The convention appears to be to define this wherever it is used. - With legitimize_reload_address now defined here, REG_MODE_OK_FOR_BASE_P - is now used here. */ -#ifndef REG_MODE_OK_FOR_BASE_P -#define REG_MODE_OK_FOR_BASE_P(REGNO, MODE) REG_OK_FOR_BASE_P (REGNO) -#endif - /* Our implementation of LEGITIMIZE_RELOAD_ADDRESS. Returns a value to replace the input X, or the original X if no replacement is called for. The output parameter *WIN is 1 if the calling macro should goto WIN, @@ -3977,13 +5318,15 @@ rs6000_tls_symbol_ref_1 (rtx *x, void *d On Darwin, we use this to generate code for floating point constants. A movsf_low is generated so we wind up with 2 instructions rather than 3. - The Darwin code is inside #if TARGET_MACHO because only then is - machopic_function_base_name() defined. */ -rtx + The Darwin code is inside #if TARGET_MACHO because only then are the + machopic_* functions defined. */ +static rtx rs6000_legitimize_reload_address (rtx x, enum machine_mode mode, int opnum, int type, int ind_levels ATTRIBUTE_UNUSED, int *win) { + bool reg_offset_p = reg_offset_addressing_ok_p (mode); + /* We must recognize output that we have already generated ourselves. 
*/ if (GET_CODE (x) == PLUS && GET_CODE (XEXP (x, 0)) == PLUS @@ -4004,11 +5347,8 @@ rs6000_legitimize_reload_address (rtx x, && GET_CODE (XEXP (x, 0)) == PLUS && XEXP (XEXP (x, 0), 0) == pic_offset_table_rtx && GET_CODE (XEXP (XEXP (x, 0), 1)) == HIGH - && GET_CODE (XEXP (XEXP (XEXP (x, 0), 1), 0)) == CONST && XEXP (XEXP (XEXP (x, 0), 1), 0) == XEXP (x, 1) - && GET_CODE (XEXP (XEXP (x, 1), 0)) == MINUS - && GET_CODE (XEXP (XEXP (XEXP (x, 1), 0), 0)) == SYMBOL_REF - && GET_CODE (XEXP (XEXP (XEXP (x, 1), 0), 1)) == SYMBOL_REF) + && machopic_operand_p (XEXP (x, 1))) { /* Result of previous invocation of this function on Darwin floating point constant. */ @@ -4025,10 +5365,11 @@ rs6000_legitimize_reload_address (rtx x, if (GET_CODE (x) == PLUS && GET_CODE (XEXP (x, 0)) == REG && REGNO (XEXP (x, 0)) < 32 - && REG_MODE_OK_FOR_BASE_P (XEXP (x, 0), mode) + && INT_REG_OK_FOR_BASE_P (XEXP (x, 0), 1) && GET_CODE (XEXP (x, 1)) == CONST_INT + && reg_offset_p && (INTVAL (XEXP (x, 1)) & 3) != 0 - && !ALTIVEC_VECTOR_MODE (mode) + && VECTOR_MEM_NONE_P (mode) && GET_MODE_SIZE (mode) >= UNITS_PER_WORD && TARGET_POWERPC64) { @@ -4043,13 +5384,14 @@ rs6000_legitimize_reload_address (rtx x, if (GET_CODE (x) == PLUS && GET_CODE (XEXP (x, 0)) == REG && REGNO (XEXP (x, 0)) < FIRST_PSEUDO_REGISTER - && REG_MODE_OK_FOR_BASE_P (XEXP (x, 0), mode) + && INT_REG_OK_FOR_BASE_P (XEXP (x, 0), 1) && GET_CODE (XEXP (x, 1)) == CONST_INT + && reg_offset_p && !SPE_VECTOR_MODE (mode) && !(TARGET_E500_DOUBLE && (mode == DFmode || mode == TFmode || mode == DDmode || mode == TDmode || mode == DImode)) - && !ALTIVEC_VECTOR_MODE (mode)) + && VECTOR_MEM_NONE_P (mode)) { HOST_WIDE_INT val = INTVAL (XEXP (x, 1)); HOST_WIDE_INT low = ((val & 0xffff) ^ 0x8000) - 0x8000; @@ -4079,7 +5421,8 @@ rs6000_legitimize_reload_address (rtx x, } if (GET_CODE (x) == SYMBOL_REF - && !ALTIVEC_VECTOR_MODE (mode) + && reg_offset_p + && VECTOR_MEM_NONE_P (mode) && !SPE_VECTOR_MODE (mode) #if TARGET_MACHO && DEFAULT_ABI == 
ABI_DARWIN @@ -4095,14 +5438,12 @@ rs6000_legitimize_reload_address (rtx x, && mode != TDmode && (mode != DImode || TARGET_POWERPC64) && ((mode != DFmode && mode != DDmode) || TARGET_POWERPC64 - || (TARGET_FPRS && TARGET_HARD_FLOAT))) + || (TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT))) { #if TARGET_MACHO if (flag_pic) { - rtx offset = gen_rtx_CONST (Pmode, - gen_rtx_MINUS (Pmode, x, - machopic_function_base_sym ())); + rtx offset = machopic_gen_offset (x); x = gen_rtx_LO_SUM (GET_MODE (x), gen_rtx_PLUS (Pmode, pic_offset_table_rtx, gen_rtx_HIGH (Pmode, offset)), offset); @@ -4121,9 +5462,11 @@ rs6000_legitimize_reload_address (rtx x, /* Reload an offset address wrapped by an AND that represents the masking of the lower bits. Strip the outer AND and let reload - convert the offset address into an indirect address. */ - if (TARGET_ALTIVEC - && ALTIVEC_VECTOR_MODE (mode) + convert the offset address into an indirect address. For VSX, + force reload to create the address with an AND in a separate + register, because we can't guarantee an altivec register will + be used. */ + if (VECTOR_MEM_ALTIVEC_P (mode) && GET_CODE (x) == AND && GET_CODE (XEXP (x, 0)) == PLUS && GET_CODE (XEXP (XEXP (x, 0), 0)) == REG @@ -4137,6 +5480,8 @@ rs6000_legitimize_reload_address (rtx x, } if (TARGET_TOC + && reg_offset_p + && GET_CODE (x) == SYMBOL_REF && constant_pool_expr_p (x) && ASM_OUTPUT_SPECIAL_POOL_ENTRY_P (get_pool_constant (x), mode)) { @@ -4148,6 +5493,33 @@ rs6000_legitimize_reload_address (rtx x, return x; } +/* Debug version of rs6000_legitimize_reload_address. 
*/ +static rtx +rs6000_debug_legitimize_reload_address (rtx x, enum machine_mode mode, + int opnum, int type, + int ind_levels, int *win) +{ + rtx ret = rs6000_legitimize_reload_address (x, mode, opnum, type, + ind_levels, win); + fprintf (stderr, + "\nrs6000_legitimize_reload_address: mode = %s, opnum = %d, " + "type = %d, ind_levels = %d, win = %d, original addr:\n", + GET_MODE_NAME (mode), opnum, type, ind_levels, *win); + debug_rtx (x); + + if (x == ret) + fprintf (stderr, "Same address returned\n"); + else if (!ret) + fprintf (stderr, "NULL returned\n"); + else + { + fprintf (stderr, "New address:\n"); + debug_rtx (ret); + } + + return ret; +} + /* GO_IF_LEGITIMATE_ADDRESS recognizes an RTL expression that is a valid memory address for an instruction. The MODE argument is the machine mode for the MEM expression @@ -4165,12 +5537,13 @@ rs6000_legitimize_reload_address (rtx x, 32-bit DImode, TImode, TFmode, TDmode), indexed addressing cannot be used because adjacent memory cells are accessed by adding word-sized offsets during assembly output. */ -int -rs6000_legitimate_address (enum machine_mode mode, rtx x, int reg_ok_strict) +bool +rs6000_legitimate_address_p (enum machine_mode mode, rtx x, bool reg_ok_strict) { + bool reg_offset_p = reg_offset_addressing_ok_p (mode); + /* If this is an unaligned stvx/ldvx type address, discard the outer AND. 
*/ - if (TARGET_ALTIVEC - && ALTIVEC_VECTOR_MODE (mode) + if (VECTOR_MEM_ALTIVEC_P (mode) && GET_CODE (x) == AND && GET_CODE (XEXP (x, 1)) == CONST_INT && INTVAL (XEXP (x, 1)) == -16) @@ -4181,7 +5554,7 @@ rs6000_legitimate_address (enum machine_ if (legitimate_indirect_address_p (x, reg_ok_strict)) return 1; if ((GET_CODE (x) == PRE_INC || GET_CODE (x) == PRE_DEC) - && !ALTIVEC_VECTOR_MODE (mode) + && !VECTOR_MEM_ALTIVEC_OR_VSX_P (mode) && !SPE_VECTOR_MODE (mode) && mode != TFmode && mode != TDmode @@ -4191,12 +5564,15 @@ rs6000_legitimate_address (enum machine_ && TARGET_UPDATE && legitimate_indirect_address_p (XEXP (x, 0), reg_ok_strict)) return 1; - if (legitimate_small_data_p (mode, x)) + if (virtual_stack_registers_memory_p (x)) return 1; - if (legitimate_constant_pool_address_p (x)) + if (reg_offset_p && legitimate_small_data_p (mode, x)) + return 1; + if (reg_offset_p && legitimate_constant_pool_address_p (x)) return 1; /* If not REG_OK_STRICT (before reload) let pass any stack offset. */ if (! 
reg_ok_strict + && reg_offset_p && GET_CODE (x) == PLUS && GET_CODE (XEXP (x, 0)) == REG && (XEXP (x, 0) == virtual_stack_vars_rtx @@ -4208,21 +5584,23 @@ rs6000_legitimate_address (enum machine_ if (mode != TImode && mode != TFmode && mode != TDmode - && ((TARGET_HARD_FLOAT && TARGET_FPRS) + && ((TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT) || TARGET_POWERPC64 - || ((mode != DFmode && mode != DDmode) || TARGET_E500_DOUBLE)) + || (mode != DFmode && mode != DDmode) + || (TARGET_E500_DOUBLE && mode != DDmode)) && (TARGET_POWERPC64 || mode != DImode) + && !avoiding_indexed_address_p (mode) && legitimate_indexed_address_p (x, reg_ok_strict)) return 1; if (GET_CODE (x) == PRE_MODIFY && mode != TImode && mode != TFmode && mode != TDmode - && ((TARGET_HARD_FLOAT && TARGET_FPRS) + && ((TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT) || TARGET_POWERPC64 || ((mode != DFmode && mode != DDmode) || TARGET_E500_DOUBLE)) && (TARGET_POWERPC64 || mode != DImode) - && !ALTIVEC_VECTOR_MODE (mode) + && !VECTOR_MEM_ALTIVEC_OR_VSX_P (mode) && !SPE_VECTOR_MODE (mode) /* Restrict addressing for DI because of our SUBREG hackery. */ && !(TARGET_E500_DOUBLE @@ -4230,32 +5608,59 @@ rs6000_legitimate_address (enum machine_ && TARGET_UPDATE && legitimate_indirect_address_p (XEXP (x, 0), reg_ok_strict) && (rs6000_legitimate_offset_address_p (mode, XEXP (x, 1), reg_ok_strict) - || legitimate_indexed_address_p (XEXP (x, 1), reg_ok_strict)) + || (!avoiding_indexed_address_p (mode) + && legitimate_indexed_address_p (XEXP (x, 1), reg_ok_strict))) && rtx_equal_p (XEXP (XEXP (x, 1), 0), XEXP (x, 0))) return 1; - if (legitimate_lo_sum_address_p (mode, x, reg_ok_strict)) + if (reg_offset_p && legitimate_lo_sum_address_p (mode, x, reg_ok_strict)) return 1; return 0; } +/* Debug version of rs6000_legitimate_address_p. 
*/ +static bool +rs6000_debug_legitimate_address_p (enum machine_mode mode, rtx x, + bool reg_ok_strict) +{ + bool ret = rs6000_legitimate_address_p (mode, x, reg_ok_strict); + fprintf (stderr, + "\nrs6000_legitimate_address_p: return = %s, mode = %s, " + "strict = %d, code = %s\n", + ret ? "true" : "false", + GET_MODE_NAME (mode), + reg_ok_strict, + GET_RTX_NAME (GET_CODE (x))); + debug_rtx (x); + + return ret; +} + /* Go to LABEL if ADDR (a legitimate address expression) has an effect that depends on the machine mode it is used for. On the RS/6000 this is true of all integral offsets (since AltiVec - modes don't allow them) or is a pre-increment or decrement. + and VSX modes don't allow them) or is a pre-increment or decrement. ??? Except that due to conceptual problems in offsettable_address_p we can't really report the problems of integral offsets. So leave this assuming that the adjustable offset must be valid for the sub-words of a TFmode operand, which is what we had before. */ -bool +static bool rs6000_mode_dependent_address (rtx addr) { switch (GET_CODE (addr)) { case PLUS: - if (GET_CODE (XEXP (addr, 1)) == CONST_INT) + /* Any offset from virtual_stack_vars_rtx and arg_pointer_rtx + is considered a legitimate address before reload, so there + are no offset restrictions in that case. Note that this + condition is safe in strict mode because any address involving + virtual_stack_vars_rtx or arg_pointer_rtx would already have + been rejected as illegitimate. */ + if (XEXP (addr, 0) != virtual_stack_vars_rtx + && XEXP (addr, 0) != arg_pointer_rtx + && GET_CODE (XEXP (addr, 1)) == CONST_INT) { unsigned HOST_WIDE_INT val = INTVAL (XEXP (addr, 1)); return val + 12 + 0x8000 >= 0x10000; @@ -4269,6 +5674,10 @@ rs6000_mode_dependent_address (rtx addr) case PRE_MODIFY: return TARGET_UPDATE; + /* AND is only allowed in Altivec loads. 
*/ + case AND: + return true; + default: break; } @@ -4276,6 +5685,40 @@ rs6000_mode_dependent_address (rtx addr) return false; } +/* Debug version of rs6000_mode_dependent_address. */ +static bool +rs6000_debug_mode_dependent_address (rtx addr) +{ + bool ret = rs6000_mode_dependent_address (addr); + + fprintf (stderr, "\nrs6000_mode_dependent_address: ret = %s\n", + ret ? "true" : "false"); + debug_rtx (addr); + + return ret; +} + +/* Implement FIND_BASE_TERM. */ + +rtx +rs6000_find_base_term (rtx op) +{ + rtx base, offset; + + split_const (op, &base, &offset); + if (GET_CODE (base) == UNSPEC) + switch (XINT (base, 1)) + { + case UNSPEC_TOCREL: + case UNSPEC_MACHOPIC_OFFSET: + /* OP represents SYM [+ OFFSET] - ANCHOR. SYM is the base term + for aliasing purposes. */ + return XVECEXP (base, 0, 0); + } + + return op; +} + /* More elaborate version of recog's offsettable_memref_p predicate that works around the ??? note of rs6000_mode_dependent_address. In particular it accepts @@ -4303,42 +5746,6 @@ rs6000_offsettable_memref_p (rtx op) return rs6000_legitimate_offset_address_p (GET_MODE (op), XEXP (op, 0), 1); } -/* Return number of consecutive hard regs needed starting at reg REGNO - to hold something of mode MODE. - This is ordinarily the length in words of a value of mode MODE - but can be less for certain modes in special long registers. - - For the SPE, GPRs are 64 bits but only 32 bits are visible in - scalar instructions. The upper 32 bits are only available to the - SIMD instructions. - - POWER and PowerPC GPRs hold 32 bits worth; - PowerPC64 GPRs and FPRs point register holds 64 bits worth. 
*/ - -int -rs6000_hard_regno_nregs (int regno, enum machine_mode mode) -{ - if (FP_REGNO_P (regno)) - return (GET_MODE_SIZE (mode) + UNITS_PER_FP_WORD - 1) / UNITS_PER_FP_WORD; - - if (SPE_SIMD_REGNO_P (regno) && TARGET_SPE && SPE_VECTOR_MODE (mode)) - return (GET_MODE_SIZE (mode) + UNITS_PER_SPE_WORD - 1) / UNITS_PER_SPE_WORD; - - if (ALTIVEC_REGNO_P (regno)) - return - (GET_MODE_SIZE (mode) + UNITS_PER_ALTIVEC_WORD - 1) / UNITS_PER_ALTIVEC_WORD; - - /* The value returned for SCmode in the E500 double case is 2 for - ABI compatibility; storing an SCmode value in a single register - would require function_arg and rs6000_spe_function_arg to handle - SCmode so as to pass the value correctly in a pair of - registers. */ - if (TARGET_E500_DOUBLE && FLOAT_MODE_P (mode) && mode != SCmode) - return (GET_MODE_SIZE (mode) + UNITS_PER_FP_WORD - 1) / UNITS_PER_FP_WORD; - - return (GET_MODE_SIZE (mode) + UNITS_PER_WORD - 1) / UNITS_PER_WORD; -} - /* Change register usage conditional on target flags. */ void rs6000_conditional_register_usage (void) @@ -4403,14 +5810,14 @@ rs6000_conditional_register_usage (void) = call_really_used_regs[14] = 1; } - if (!TARGET_ALTIVEC) + if (!TARGET_ALTIVEC && !TARGET_VSX) { for (i = FIRST_ALTIVEC_REGNO; i <= LAST_ALTIVEC_REGNO; ++i) fixed_regs[i] = call_used_regs[i] = call_really_used_regs[i] = 1; call_really_used_regs[VRSAVE_REGNO] = 1; } - if (TARGET_ALTIVEC) + if (TARGET_ALTIVEC || TARGET_VSX) global_regs[VSCR_REGNO] = 1; if (TARGET_ALTIVEC_ABI) @@ -4628,6 +6035,20 @@ rs6000_emit_move (rtx dest, rtx source, operands[0] = dest; operands[1] = source; + if (TARGET_DEBUG_ADDR) + { + fprintf (stderr, + "\nrs6000_emit_move: mode = %s, reload_in_progress = %d, " + "reload_completed = %d, can_create_pseudos = %d.\ndest:\n", + GET_MODE_NAME (mode), + reload_in_progress, + reload_completed, + can_create_pseudo_p ()); + debug_rtx (dest); + fprintf (stderr, "source:\n"); + debug_rtx (source); + } + /* Sanity checks. 
Check that we get CONST_DOUBLE only when we should. */ if (GET_CODE (operands[1]) == CONST_DOUBLE && ! FLOAT_MODE_P (mode) @@ -4667,12 +6088,38 @@ rs6000_emit_move (rtx dest, rtx source, return; } + /* Fix up invalid (const (plus (symbol_ref) (reg))) that seems to be created + in the secondary_reload phase, which evidently overwrites the CONST_INT + with a register. */ + if (GET_CODE (source) == CONST && GET_CODE (XEXP (source, 0)) == PLUS + && mode == Pmode) + { + rtx add_op0 = XEXP (XEXP (source, 0), 0); + rtx add_op1 = XEXP (XEXP (source, 0), 1); + + if (GET_CODE (add_op0) == SYMBOL_REF && GET_CODE (add_op1) == REG) + { + rtx tmp = (can_create_pseudo_p ()) ? gen_reg_rtx (Pmode) : dest; + + if (TARGET_DEBUG_ADDR) + { + fprintf (stderr, "\nrs6000_emit_move: bad source\n"); + debug_rtx (source); + } + + rs6000_emit_move (tmp, add_op0, Pmode); + emit_insn (gen_rtx_SET (VOIDmode, dest, + gen_rtx_PLUS (Pmode, tmp, add_op1))); + return; + } + } + if (can_create_pseudo_p () && GET_CODE (operands[0]) == MEM && !gpc_reg_operand (operands[1], mode)) operands[1] = force_reg (mode, operands[1]); if (mode == SFmode && ! TARGET_POWERPC - && TARGET_HARD_FLOAT && TARGET_FPRS + && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && GET_CODE (operands[0]) == MEM) { int regnum; @@ -4832,6 +6279,8 @@ rs6000_emit_move (rtx dest, rtx source, case V2SFmode: case V2SImode: case V1DImode: + case V2DFmode: + case V2DImode: if (CONSTANT_P (operands[1]) && !easy_vector_constant (operands[1], mode)) operands[1] = force_const_mem (mode, operands[1]); @@ -4937,14 +6386,6 @@ rs6000_emit_move (rtx dest, rtx source, && ! legitimate_constant_pool_address_p (operands[1]) && ! toc_relative_expr_p (operands[1])) { - /* Emit a USE operation so that the constant isn't deleted if - expensive optimizations are turned on because nobody - references it. This should only be done for operands that - contain SYMBOL_REFs with CONSTANT_POOL_ADDRESS_P set. 
- This should not be done for operands that contain LABEL_REFs. - For now, we just handle the obvious case. */ - if (GET_CODE (operands[1]) != LABEL_REF) - emit_insn (gen_rtx_USE (VOIDmode, operands[1])); #if TARGET_MACHO /* Darwin uses a special PIC legitimizer. */ @@ -4976,16 +6417,14 @@ rs6000_emit_move (rtx dest, rtx source, rtx other = XEXP (XEXP (operands[1], 0), 1); sym = force_reg (mode, sym); - if (mode == SImode) - emit_insn (gen_addsi3 (operands[0], sym, other)); - else - emit_insn (gen_adddi3 (operands[0], sym, other)); + emit_insn (gen_add3_insn (operands[0], sym, other)); return; } operands[1] = force_const_mem (mode, operands[1]); if (TARGET_TOC + && GET_CODE (XEXP (operands[1], 0)) == SYMBOL_REF && constant_pool_expr_p (XEXP (operands[1], 0)) && ASM_OUTPUT_SPECIAL_POOL_ENTRY_P ( get_pool_constant (XEXP (operands[1], 0)), @@ -5015,7 +6454,7 @@ rs6000_emit_move (rtx dest, rtx source, break; default: - gcc_unreachable (); + fatal_insn ("bad move", gen_rtx_SET (VOIDmode, dest, source)); } /* Above, we may have called force_const_mem which may have returned @@ -5035,10 +6474,10 @@ rs6000_emit_move (rtx dest, rtx source, && TARGET_HARD_FLOAT && TARGET_FPRS) /* Nonzero if we can use an AltiVec register to pass this arg. 
*/ -#define USE_ALTIVEC_FOR_ARG_P(CUM,MODE,TYPE,NAMED) \ - (ALTIVEC_VECTOR_MODE (MODE) \ - && (CUM)->vregno <= ALTIVEC_ARG_MAX_REG \ - && TARGET_ALTIVEC_ABI \ +#define USE_ALTIVEC_FOR_ARG_P(CUM,MODE,TYPE,NAMED) \ + ((ALTIVEC_VECTOR_MODE (MODE) || VSX_VECTOR_MODE (MODE)) \ + && (CUM)->vregno <= ALTIVEC_ARG_MAX_REG \ + && TARGET_ALTIVEC_ABI \ && (NAMED)) /* Return a nonzero value to say to return the function value in @@ -5279,7 +6718,7 @@ function_arg_boundary (enum machine_mode && int_size_in_bytes (type) >= 8 && int_size_in_bytes (type) < 16)) return 64; - else if (ALTIVEC_VECTOR_MODE (mode) + else if ((ALTIVEC_VECTOR_MODE (mode) || VSX_VECTOR_MODE (mode)) || (type && TREE_CODE (type) == VECTOR_TYPE && int_size_in_bytes (type) >= 16)) return 128; @@ -5425,6 +6864,7 @@ function_arg_advance (CUMULATIVE_ARGS *c if (TARGET_ALTIVEC_ABI && (ALTIVEC_VECTOR_MODE (mode) + || VSX_VECTOR_MODE (mode) || (type && TREE_CODE (type) == VECTOR_TYPE && int_size_in_bytes (type) == 16))) { @@ -5509,9 +6949,10 @@ function_arg_advance (CUMULATIVE_ARGS *c else if (DEFAULT_ABI == ABI_V4) { if (TARGET_HARD_FLOAT && TARGET_FPRS - && (mode == SFmode || mode == DFmode - || mode == SDmode || mode == DDmode || mode == TDmode - || (mode == TFmode && !TARGET_IEEEQUAD))) + && ((TARGET_SINGLE_FLOAT && mode == SFmode) + || (TARGET_DOUBLE_FLOAT && mode == DFmode) + || (mode == TFmode && !TARGET_IEEEQUAD) + || mode == SDmode || mode == DDmode || mode == TDmode)) { /* _Decimal128 must use an even/odd register pair. This assumes that the register number is odd when fregno is odd. 
*/ @@ -5607,14 +7048,12 @@ spe_build_register_parallel (enum machin switch (mode) { case DFmode: - case DDmode: r1 = gen_rtx_REG (DImode, gregno); r1 = gen_rtx_EXPR_LIST (VOIDmode, r1, const0_rtx); return gen_rtx_PARALLEL (mode, gen_rtvec (1, r1)); case DCmode: case TFmode: - case TDmode: r1 = gen_rtx_REG (DImode, gregno); r1 = gen_rtx_EXPR_LIST (VOIDmode, r1, const0_rtx); r3 = gen_rtx_REG (DImode, gregno + 2); @@ -5647,13 +7086,12 @@ rs6000_spe_function_arg (CUMULATIVE_ARGS /* On E500 v2, double arithmetic is done on the full 64-bit GPR, but are passed and returned in a pair of GPRs for ABI compatibility. */ if (TARGET_E500_DOUBLE && (mode == DFmode || mode == TFmode - || mode == DDmode || mode == TDmode || mode == DCmode || mode == TCmode)) { int n_words = rs6000_arg_size (mode, type); /* Doubles go in an odd/even register pair (r5/r6, etc). */ - if (mode == DFmode || mode == DDmode) + if (mode == DFmode) gregno += (1 - gregno) & 1; /* Multi-reg args are not split between registers and stack. 
*/ @@ -6021,6 +7459,7 @@ function_arg (CUMULATIVE_ARGS *cum, enum return gen_rtx_REG (mode, cum->vregno); else if (TARGET_ALTIVEC_ABI && (ALTIVEC_VECTOR_MODE (mode) + || VSX_VECTOR_MODE (mode) || (type && TREE_CODE (type) == VECTOR_TYPE && int_size_in_bytes (type) == 16))) { @@ -6066,17 +7505,16 @@ function_arg (CUMULATIVE_ARGS *cum, enum else if (TARGET_SPE_ABI && TARGET_SPE && (SPE_VECTOR_MODE (mode) || (TARGET_E500_DOUBLE && (mode == DFmode - || mode == DDmode || mode == DCmode || mode == TFmode - || mode == TDmode || mode == TCmode)))) return rs6000_spe_function_arg (cum, mode, type); else if (abi == ABI_V4) { if (TARGET_HARD_FLOAT && TARGET_FPRS - && (mode == SFmode || mode == DFmode + && ((TARGET_SINGLE_FLOAT && mode == SFmode) + || (TARGET_DOUBLE_FLOAT && mode == DFmode) || (mode == TFmode && !TARGET_IEEEQUAD) || mode == SDmode || mode == DDmode || mode == TDmode)) { @@ -6538,11 +7976,17 @@ setup_incoming_varargs (CUMULATIVE_ARGS fregno <= FP_ARG_V4_MAX_REG && nregs < cfun->va_list_fpr_size; fregno++, off += UNITS_PER_FP_WORD, nregs++) { - mem = gen_rtx_MEM (DFmode, plus_constant (save_area, off)); - MEM_NOTRAP_P (mem) = 1; - set_mem_alias_set (mem, set); - set_mem_align (mem, GET_MODE_ALIGNMENT (DFmode)); - emit_move_insn (mem, gen_rtx_REG (DFmode, fregno)); + mem = gen_rtx_MEM ((TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT) + ? DFmode : SFmode, + plus_constant (save_area, off)); + MEM_NOTRAP_P (mem) = 1; + set_mem_alias_set (mem, set); + set_mem_align (mem, GET_MODE_ALIGNMENT ( + (TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT) + ? DFmode : SFmode)); + emit_move_insn (mem, gen_rtx_REG ( + (TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT) + ? 
DFmode : SFmode, fregno)); } emit_label (lab); @@ -6624,9 +8068,12 @@ rs6000_va_start (tree valist, rtx nextar valist = build_va_arg_indirect_ref (valist); gpr = build3 (COMPONENT_REF, TREE_TYPE (f_gpr), valist, f_gpr, NULL_TREE); - fpr = build3 (COMPONENT_REF, TREE_TYPE (f_fpr), valist, f_fpr, NULL_TREE); - ovf = build3 (COMPONENT_REF, TREE_TYPE (f_ovf), valist, f_ovf, NULL_TREE); - sav = build3 (COMPONENT_REF, TREE_TYPE (f_sav), valist, f_sav, NULL_TREE); + fpr = build3 (COMPONENT_REF, TREE_TYPE (f_fpr), unshare_expr (valist), + f_fpr, NULL_TREE); + ovf = build3 (COMPONENT_REF, TREE_TYPE (f_ovf), unshare_expr (valist), + f_ovf, NULL_TREE); + sav = build3 (COMPONENT_REF, TREE_TYPE (f_sav), unshare_expr (valist), + f_sav, NULL_TREE); /* Count number of gp and fp argument registers used. */ words = current_function_args_info.words; @@ -6740,27 +8187,31 @@ rs6000_gimplify_va_arg (tree valist, tre valist = build_va_arg_indirect_ref (valist); gpr = build3 (COMPONENT_REF, TREE_TYPE (f_gpr), valist, f_gpr, NULL_TREE); - fpr = build3 (COMPONENT_REF, TREE_TYPE (f_fpr), valist, f_fpr, NULL_TREE); - ovf = build3 (COMPONENT_REF, TREE_TYPE (f_ovf), valist, f_ovf, NULL_TREE); - sav = build3 (COMPONENT_REF, TREE_TYPE (f_sav), valist, f_sav, NULL_TREE); + fpr = build3 (COMPONENT_REF, TREE_TYPE (f_fpr), unshare_expr (valist), + f_fpr, NULL_TREE); + ovf = build3 (COMPONENT_REF, TREE_TYPE (f_ovf), unshare_expr (valist), + f_ovf, NULL_TREE); + sav = build3 (COMPONENT_REF, TREE_TYPE (f_sav), unshare_expr (valist), + f_sav, NULL_TREE); size = int_size_in_bytes (type); rsize = (size + 3) / 4; align = 1; if (TARGET_HARD_FLOAT && TARGET_FPRS - && (TYPE_MODE (type) == SFmode - || TYPE_MODE (type) == DFmode - || TYPE_MODE (type) == TFmode - || TYPE_MODE (type) == SDmode - || TYPE_MODE (type) == DDmode - || TYPE_MODE (type) == TDmode)) + && ((TARGET_SINGLE_FLOAT && TYPE_MODE (type) == SFmode) + || (TARGET_DOUBLE_FLOAT + && (TYPE_MODE (type) == DFmode + || TYPE_MODE (type) == TFmode + || 
TYPE_MODE (type) == SDmode + || TYPE_MODE (type) == DDmode + || TYPE_MODE (type) == TDmode)))) { /* FP args go in FP registers, if present. */ reg = fpr; n_reg = (size + 7) / 8; - sav_ofs = 8*4; - sav_scale = 8; + sav_ofs = ((TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT) ? 8 : 4) * 4; + sav_scale = ((TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT) ? 8 : 4); if (TYPE_MODE (type) != SFmode && TYPE_MODE (type) != SDmode) align = 8; } @@ -6796,18 +8247,19 @@ rs6000_gimplify_va_arg (tree valist, tre if (n_reg == 2 && reg == gpr) { regalign = 1; - u = build2 (BIT_AND_EXPR, TREE_TYPE (reg), reg, + u = build2 (BIT_AND_EXPR, TREE_TYPE (reg), unshare_expr (reg), build_int_cst (TREE_TYPE (reg), n_reg - 1)); - u = build2 (POSTINCREMENT_EXPR, TREE_TYPE (reg), reg, u); + u = build2 (POSTINCREMENT_EXPR, TREE_TYPE (reg), + unshare_expr (reg), u); } /* _Decimal128 is passed in even/odd fpr pairs; the stored reg number is 0 for f1, so we want to make it odd. */ else if (reg == fpr && TYPE_MODE (type) == TDmode) { regalign = 1; - t = build2 (BIT_IOR_EXPR, TREE_TYPE (reg), reg, + t = build2 (BIT_IOR_EXPR, TREE_TYPE (reg), unshare_expr (reg), build_int_cst (TREE_TYPE (reg), 1)); - u = build2 (MODIFY_EXPR, void_type_node, reg, t); + u = build2 (MODIFY_EXPR, void_type_node, unshare_expr (reg), t); } t = fold_convert (TREE_TYPE (reg), size_int (8 - n_reg + 1)); @@ -6820,7 +8272,7 @@ rs6000_gimplify_va_arg (tree valist, tre if (sav_ofs) t = build2 (POINTER_PLUS_EXPR, ptr_type_node, sav, size_int (sav_ofs)); - u = build2 (POSTINCREMENT_EXPR, TREE_TYPE (reg), reg, + u = build2 (POSTINCREMENT_EXPR, TREE_TYPE (reg), unshare_expr (reg), build_int_cst (TREE_TYPE (reg), n_reg)); u = fold_convert (sizetype, u); u = build2 (MULT_EXPR, sizetype, u, size_int (sav_scale)); @@ -6907,12 +8359,54 @@ def_builtin (int mask, const char *name, { if ((mask & target_flags) || TARGET_PAIRED_FLOAT) { + tree t; if (rs6000_builtin_decls[code]) - abort (); + fatal_error ("internal error: builtin function to %s already 
processed.", + name); - rs6000_builtin_decls[code] = + rs6000_builtin_decls[code] = t = add_builtin_function (name, type, code, BUILT_IN_MD, NULL, NULL_TREE); + + gcc_assert (code >= 0 && code < (int)MAX_RS6000_BUILTINS); + switch (builtin_classify[code]) + { + default: + gcc_unreachable (); + + /* assume builtin can do anything. */ + case RS6000_BTC_MISC: + break; + + /* const function, function only depends on the inputs. */ + case RS6000_BTC_CONST: + TREE_CONSTANT (t) = 1; + TREE_NOTHROW (t) = 1; + break; + + /* pure function, function can read global memory. */ + case RS6000_BTC_PURE: + DECL_IS_PURE (t) = 1; + TREE_NOTHROW (t) = 1; + break; + + /* Function is a math function. If rounding mode is on, then treat + the function as not reading global memory, but it can have + arbitrary side effects. If it is off, then assume the function is + a const function. This mimics the ATTR_MATHFN_FPROUNDING + attribute in builtin-attribute.def that is used for the math + functions. */ + case RS6000_BTC_FP_PURE: + TREE_NOTHROW (t) = 1; + if (flag_rounding_math) + { + DECL_IS_PURE (t) = 1; + DECL_IS_NOVOPS (t) = 1; + } + else + TREE_CONSTANT (t) = 1; + break; + } } } @@ -6931,14 +8425,26 @@ static const struct builtin_description { MASK_ALTIVEC, CODE_FOR_altivec_vmsumuhs, "__builtin_altivec_vmsumuhs", ALTIVEC_BUILTIN_VMSUMUHS }, { MASK_ALTIVEC, CODE_FOR_altivec_vmsumshs, "__builtin_altivec_vmsumshs", ALTIVEC_BUILTIN_VMSUMSHS }, { MASK_ALTIVEC, CODE_FOR_altivec_vnmsubfp, "__builtin_altivec_vnmsubfp", ALTIVEC_BUILTIN_VNMSUBFP }, + { MASK_ALTIVEC, CODE_FOR_altivec_vperm_v2df, "__builtin_altivec_vperm_2df", ALTIVEC_BUILTIN_VPERM_2DF }, + { MASK_ALTIVEC, CODE_FOR_altivec_vperm_v2di, "__builtin_altivec_vperm_2di", ALTIVEC_BUILTIN_VPERM_2DI }, { MASK_ALTIVEC, CODE_FOR_altivec_vperm_v4sf, "__builtin_altivec_vperm_4sf", ALTIVEC_BUILTIN_VPERM_4SF }, { MASK_ALTIVEC, CODE_FOR_altivec_vperm_v4si, "__builtin_altivec_vperm_4si", ALTIVEC_BUILTIN_VPERM_4SI }, { MASK_ALTIVEC, 
CODE_FOR_altivec_vperm_v8hi, "__builtin_altivec_vperm_8hi", ALTIVEC_BUILTIN_VPERM_8HI }, - { MASK_ALTIVEC, CODE_FOR_altivec_vperm_v16qi, "__builtin_altivec_vperm_16qi", ALTIVEC_BUILTIN_VPERM_16QI }, - { MASK_ALTIVEC, CODE_FOR_altivec_vsel_v4sf, "__builtin_altivec_vsel_4sf", ALTIVEC_BUILTIN_VSEL_4SF }, - { MASK_ALTIVEC, CODE_FOR_altivec_vsel_v4si, "__builtin_altivec_vsel_4si", ALTIVEC_BUILTIN_VSEL_4SI }, - { MASK_ALTIVEC, CODE_FOR_altivec_vsel_v8hi, "__builtin_altivec_vsel_8hi", ALTIVEC_BUILTIN_VSEL_8HI }, - { MASK_ALTIVEC, CODE_FOR_altivec_vsel_v16qi, "__builtin_altivec_vsel_16qi", ALTIVEC_BUILTIN_VSEL_16QI }, + { MASK_ALTIVEC, CODE_FOR_altivec_vperm_v16qi_uns, "__builtin_altivec_vperm_16qi", ALTIVEC_BUILTIN_VPERM_16QI }, + { MASK_ALTIVEC, CODE_FOR_altivec_vperm_v2di_uns, "__builtin_altivec_vperm_2di_uns", ALTIVEC_BUILTIN_VPERM_2DI_UNS }, + { MASK_ALTIVEC, CODE_FOR_altivec_vperm_v4si_uns, "__builtin_altivec_vperm_4si_uns", ALTIVEC_BUILTIN_VPERM_4SI_UNS }, + { MASK_ALTIVEC, CODE_FOR_altivec_vperm_v8hi_uns, "__builtin_altivec_vperm_8hi_uns", ALTIVEC_BUILTIN_VPERM_8HI_UNS }, + { MASK_ALTIVEC, CODE_FOR_altivec_vperm_v16qi_uns, "__builtin_altivec_vperm_16qi_uns", ALTIVEC_BUILTIN_VPERM_16QI_UNS }, + { MASK_ALTIVEC, CODE_FOR_vector_select_v4sf, "__builtin_altivec_vsel_4sf", ALTIVEC_BUILTIN_VSEL_4SF }, + { MASK_ALTIVEC, CODE_FOR_vector_select_v4si, "__builtin_altivec_vsel_4si", ALTIVEC_BUILTIN_VSEL_4SI }, + { MASK_ALTIVEC, CODE_FOR_vector_select_v8hi, "__builtin_altivec_vsel_8hi", ALTIVEC_BUILTIN_VSEL_8HI }, + { MASK_ALTIVEC, CODE_FOR_vector_select_v16qi, "__builtin_altivec_vsel_16qi", ALTIVEC_BUILTIN_VSEL_16QI }, + { MASK_ALTIVEC, CODE_FOR_vector_select_v2df, "__builtin_altivec_vsel_2df", ALTIVEC_BUILTIN_VSEL_2DF }, + { MASK_ALTIVEC, CODE_FOR_vector_select_v2di, "__builtin_altivec_vsel_2di", ALTIVEC_BUILTIN_VSEL_2DI }, + { MASK_ALTIVEC, CODE_FOR_vector_select_v4si_uns, "__builtin_altivec_vsel_4si_uns", ALTIVEC_BUILTIN_VSEL_4SI_UNS }, + { MASK_ALTIVEC, 
CODE_FOR_vector_select_v8hi_uns, "__builtin_altivec_vsel_8hi_uns", ALTIVEC_BUILTIN_VSEL_8HI_UNS }, + { MASK_ALTIVEC, CODE_FOR_vector_select_v16qi_uns, "__builtin_altivec_vsel_16qi_uns", ALTIVEC_BUILTIN_VSEL_16QI_UNS }, + { MASK_ALTIVEC, CODE_FOR_vector_select_v2di_uns, "__builtin_altivec_vsel_2di_uns", ALTIVEC_BUILTIN_VSEL_2DI_UNS }, { MASK_ALTIVEC, CODE_FOR_altivec_vsldoi_v16qi, "__builtin_altivec_vsldoi_16qi", ALTIVEC_BUILTIN_VSLDOI_16QI }, { MASK_ALTIVEC, CODE_FOR_altivec_vsldoi_v8hi, "__builtin_altivec_vsldoi_8hi", ALTIVEC_BUILTIN_VSLDOI_8HI }, { MASK_ALTIVEC, CODE_FOR_altivec_vsldoi_v4si, "__builtin_altivec_vsldoi_4si", ALTIVEC_BUILTIN_VSLDOI_4SI }, @@ -6960,6 +8466,59 @@ static const struct builtin_description { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_perm", ALTIVEC_BUILTIN_VEC_PERM }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_sel", ALTIVEC_BUILTIN_VEC_SEL }, + { MASK_VSX, CODE_FOR_vsx_fmaddv2df4, "__builtin_vsx_xvmadddp", VSX_BUILTIN_XVMADDDP }, + { MASK_VSX, CODE_FOR_vsx_fmsubv2df4, "__builtin_vsx_xvmsubdp", VSX_BUILTIN_XVMSUBDP }, + { MASK_VSX, CODE_FOR_vsx_fnmaddv2df4, "__builtin_vsx_xvnmadddp", VSX_BUILTIN_XVNMADDDP }, + { MASK_VSX, CODE_FOR_vsx_fnmsubv2df4, "__builtin_vsx_xvnmsubdp", VSX_BUILTIN_XVNMSUBDP }, + + { MASK_VSX, CODE_FOR_vsx_fmaddv4sf4, "__builtin_vsx_xvmaddsp", VSX_BUILTIN_XVMADDSP }, + { MASK_VSX, CODE_FOR_vsx_fmsubv4sf4, "__builtin_vsx_xvmsubsp", VSX_BUILTIN_XVMSUBSP }, + { MASK_VSX, CODE_FOR_vsx_fnmaddv4sf4, "__builtin_vsx_xvnmaddsp", VSX_BUILTIN_XVNMADDSP }, + { MASK_VSX, CODE_FOR_vsx_fnmsubv4sf4, "__builtin_vsx_xvnmsubsp", VSX_BUILTIN_XVNMSUBSP }, + + { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_msub", VSX_BUILTIN_VEC_MSUB }, + { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_nmadd", VSX_BUILTIN_VEC_NMADD }, + + { MASK_VSX, CODE_FOR_vector_select_v2di, "__builtin_vsx_xxsel_2di", VSX_BUILTIN_XXSEL_2DI }, + { MASK_VSX, CODE_FOR_vector_select_v2df, "__builtin_vsx_xxsel_2df", VSX_BUILTIN_XXSEL_2DF }, + { MASK_VSX, 
CODE_FOR_vector_select_v4sf, "__builtin_vsx_xxsel_4sf", VSX_BUILTIN_XXSEL_4SF }, + { MASK_VSX, CODE_FOR_vector_select_v4si, "__builtin_vsx_xxsel_4si", VSX_BUILTIN_XXSEL_4SI }, + { MASK_VSX, CODE_FOR_vector_select_v8hi, "__builtin_vsx_xxsel_8hi", VSX_BUILTIN_XXSEL_8HI }, + { MASK_VSX, CODE_FOR_vector_select_v16qi, "__builtin_vsx_xxsel_16qi", VSX_BUILTIN_XXSEL_16QI }, + { MASK_VSX, CODE_FOR_vector_select_v2di_uns, "__builtin_vsx_xxsel_2di_uns", VSX_BUILTIN_XXSEL_2DI_UNS }, + { MASK_VSX, CODE_FOR_vector_select_v4si_uns, "__builtin_vsx_xxsel_4si_uns", VSX_BUILTIN_XXSEL_4SI_UNS }, + { MASK_VSX, CODE_FOR_vector_select_v8hi_uns, "__builtin_vsx_xxsel_8hi_uns", VSX_BUILTIN_XXSEL_8HI_UNS }, + { MASK_VSX, CODE_FOR_vector_select_v16qi_uns, "__builtin_vsx_xxsel_16qi_uns", VSX_BUILTIN_XXSEL_16QI_UNS }, + + { MASK_VSX, CODE_FOR_altivec_vperm_v2di, "__builtin_vsx_vperm_2di", VSX_BUILTIN_VPERM_2DI }, + { MASK_VSX, CODE_FOR_altivec_vperm_v2df, "__builtin_vsx_vperm_2df", VSX_BUILTIN_VPERM_2DF }, + { MASK_VSX, CODE_FOR_altivec_vperm_v4sf, "__builtin_vsx_vperm_4sf", VSX_BUILTIN_VPERM_4SF }, + { MASK_VSX, CODE_FOR_altivec_vperm_v4si, "__builtin_vsx_vperm_4si", VSX_BUILTIN_VPERM_4SI }, + { MASK_VSX, CODE_FOR_altivec_vperm_v8hi, "__builtin_vsx_vperm_8hi", VSX_BUILTIN_VPERM_8HI }, + { MASK_VSX, CODE_FOR_altivec_vperm_v16qi, "__builtin_vsx_vperm_16qi", VSX_BUILTIN_VPERM_16QI }, + { MASK_VSX, CODE_FOR_altivec_vperm_v2di_uns, "__builtin_vsx_vperm_2di_uns", VSX_BUILTIN_VPERM_2DI_UNS }, + { MASK_VSX, CODE_FOR_altivec_vperm_v4si_uns, "__builtin_vsx_vperm_4si_uns", VSX_BUILTIN_VPERM_4SI_UNS }, + { MASK_VSX, CODE_FOR_altivec_vperm_v8hi_uns, "__builtin_vsx_vperm_8hi_uns", VSX_BUILTIN_VPERM_8HI_UNS }, + { MASK_VSX, CODE_FOR_altivec_vperm_v16qi_uns, "__builtin_vsx_vperm_16qi_uns", VSX_BUILTIN_VPERM_16QI_UNS }, + + { MASK_VSX, CODE_FOR_vsx_xxpermdi_v2df, "__builtin_vsx_xxpermdi_2df", VSX_BUILTIN_XXPERMDI_2DF }, + { MASK_VSX, CODE_FOR_vsx_xxpermdi_v2di, "__builtin_vsx_xxpermdi_2di", 
VSX_BUILTIN_XXPERMDI_2DI }, + { MASK_VSX, CODE_FOR_vsx_xxpermdi_v4sf, "__builtin_vsx_xxpermdi_4sf", VSX_BUILTIN_XXPERMDI_4SF }, + { MASK_VSX, CODE_FOR_vsx_xxpermdi_v4si, "__builtin_vsx_xxpermdi_4si", VSX_BUILTIN_XXPERMDI_4SI }, + { MASK_VSX, CODE_FOR_vsx_xxpermdi_v8hi, "__builtin_vsx_xxpermdi_8hi", VSX_BUILTIN_XXPERMDI_8HI }, + { MASK_VSX, CODE_FOR_vsx_xxpermdi_v16qi, "__builtin_vsx_xxpermdi_16qi", VSX_BUILTIN_XXPERMDI_16QI }, + { MASK_VSX, CODE_FOR_nothing, "__builtin_vsx_xxpermdi", VSX_BUILTIN_VEC_XXPERMDI }, + { MASK_VSX, CODE_FOR_vsx_set_v2df, "__builtin_vsx_set_2df", VSX_BUILTIN_SET_2DF }, + { MASK_VSX, CODE_FOR_vsx_set_v2di, "__builtin_vsx_set_2di", VSX_BUILTIN_SET_2DI }, + + { MASK_VSX, CODE_FOR_vsx_xxsldwi_v2di, "__builtin_vsx_xxsldwi_2di", VSX_BUILTIN_XXSLDWI_2DI }, + { MASK_VSX, CODE_FOR_vsx_xxsldwi_v2df, "__builtin_vsx_xxsldwi_2df", VSX_BUILTIN_XXSLDWI_2DF }, + { MASK_VSX, CODE_FOR_vsx_xxsldwi_v4sf, "__builtin_vsx_xxsldwi_4sf", VSX_BUILTIN_XXSLDWI_4SF }, + { MASK_VSX, CODE_FOR_vsx_xxsldwi_v4si, "__builtin_vsx_xxsldwi_4si", VSX_BUILTIN_XXSLDWI_4SI }, + { MASK_VSX, CODE_FOR_vsx_xxsldwi_v8hi, "__builtin_vsx_xxsldwi_8hi", VSX_BUILTIN_XXSLDWI_8HI }, + { MASK_VSX, CODE_FOR_vsx_xxsldwi_v16qi, "__builtin_vsx_xxsldwi_16qi", VSX_BUILTIN_XXSLDWI_16QI }, + { MASK_VSX, CODE_FOR_nothing, "__builtin_vsx_xxsldwi", VSX_BUILTIN_VEC_XXSLDWI }, + { 0, CODE_FOR_paired_msub, "__builtin_paired_msub", PAIRED_BUILTIN_MSUB }, { 0, CODE_FOR_paired_madd, "__builtin_paired_madd", PAIRED_BUILTIN_MADD }, { 0, CODE_FOR_paired_madds0, "__builtin_paired_madds0", PAIRED_BUILTIN_MADDS0 }, @@ -7012,18 +8571,18 @@ static struct builtin_description bdesc_ { MASK_ALTIVEC, CODE_FOR_altivec_vcfux, "__builtin_altivec_vcfux", ALTIVEC_BUILTIN_VCFUX }, { MASK_ALTIVEC, CODE_FOR_altivec_vcfsx, "__builtin_altivec_vcfsx", ALTIVEC_BUILTIN_VCFSX }, { MASK_ALTIVEC, CODE_FOR_altivec_vcmpbfp, "__builtin_altivec_vcmpbfp", ALTIVEC_BUILTIN_VCMPBFP }, - { MASK_ALTIVEC, CODE_FOR_altivec_vcmpequb, 
"__builtin_altivec_vcmpequb", ALTIVEC_BUILTIN_VCMPEQUB }, - { MASK_ALTIVEC, CODE_FOR_altivec_vcmpequh, "__builtin_altivec_vcmpequh", ALTIVEC_BUILTIN_VCMPEQUH }, - { MASK_ALTIVEC, CODE_FOR_altivec_vcmpequw, "__builtin_altivec_vcmpequw", ALTIVEC_BUILTIN_VCMPEQUW }, - { MASK_ALTIVEC, CODE_FOR_altivec_vcmpeqfp, "__builtin_altivec_vcmpeqfp", ALTIVEC_BUILTIN_VCMPEQFP }, - { MASK_ALTIVEC, CODE_FOR_altivec_vcmpgefp, "__builtin_altivec_vcmpgefp", ALTIVEC_BUILTIN_VCMPGEFP }, - { MASK_ALTIVEC, CODE_FOR_altivec_vcmpgtub, "__builtin_altivec_vcmpgtub", ALTIVEC_BUILTIN_VCMPGTUB }, - { MASK_ALTIVEC, CODE_FOR_altivec_vcmpgtsb, "__builtin_altivec_vcmpgtsb", ALTIVEC_BUILTIN_VCMPGTSB }, - { MASK_ALTIVEC, CODE_FOR_altivec_vcmpgtuh, "__builtin_altivec_vcmpgtuh", ALTIVEC_BUILTIN_VCMPGTUH }, - { MASK_ALTIVEC, CODE_FOR_altivec_vcmpgtsh, "__builtin_altivec_vcmpgtsh", ALTIVEC_BUILTIN_VCMPGTSH }, - { MASK_ALTIVEC, CODE_FOR_altivec_vcmpgtuw, "__builtin_altivec_vcmpgtuw", ALTIVEC_BUILTIN_VCMPGTUW }, - { MASK_ALTIVEC, CODE_FOR_altivec_vcmpgtsw, "__builtin_altivec_vcmpgtsw", ALTIVEC_BUILTIN_VCMPGTSW }, - { MASK_ALTIVEC, CODE_FOR_altivec_vcmpgtfp, "__builtin_altivec_vcmpgtfp", ALTIVEC_BUILTIN_VCMPGTFP }, + { MASK_ALTIVEC, CODE_FOR_vector_eqv16qi, "__builtin_altivec_vcmpequb", ALTIVEC_BUILTIN_VCMPEQUB }, + { MASK_ALTIVEC, CODE_FOR_vector_eqv8hi, "__builtin_altivec_vcmpequh", ALTIVEC_BUILTIN_VCMPEQUH }, + { MASK_ALTIVEC, CODE_FOR_vector_eqv4si, "__builtin_altivec_vcmpequw", ALTIVEC_BUILTIN_VCMPEQUW }, + { MASK_ALTIVEC, CODE_FOR_vector_eqv4sf, "__builtin_altivec_vcmpeqfp", ALTIVEC_BUILTIN_VCMPEQFP }, + { MASK_ALTIVEC, CODE_FOR_vector_gev4sf, "__builtin_altivec_vcmpgefp", ALTIVEC_BUILTIN_VCMPGEFP }, + { MASK_ALTIVEC, CODE_FOR_vector_gtuv16qi, "__builtin_altivec_vcmpgtub", ALTIVEC_BUILTIN_VCMPGTUB }, + { MASK_ALTIVEC, CODE_FOR_vector_gtv16qi, "__builtin_altivec_vcmpgtsb", ALTIVEC_BUILTIN_VCMPGTSB }, + { MASK_ALTIVEC, CODE_FOR_vector_gtuv8hi, "__builtin_altivec_vcmpgtuh", ALTIVEC_BUILTIN_VCMPGTUH }, + { 
MASK_ALTIVEC, CODE_FOR_vector_gtv8hi, "__builtin_altivec_vcmpgtsh", ALTIVEC_BUILTIN_VCMPGTSH }, + { MASK_ALTIVEC, CODE_FOR_vector_gtuv4si, "__builtin_altivec_vcmpgtuw", ALTIVEC_BUILTIN_VCMPGTUW }, + { MASK_ALTIVEC, CODE_FOR_vector_gtv4si, "__builtin_altivec_vcmpgtsw", ALTIVEC_BUILTIN_VCMPGTSW }, + { MASK_ALTIVEC, CODE_FOR_vector_gtv4sf, "__builtin_altivec_vcmpgtfp", ALTIVEC_BUILTIN_VCMPGTFP }, { MASK_ALTIVEC, CODE_FOR_altivec_vctsxs, "__builtin_altivec_vctsxs", ALTIVEC_BUILTIN_VCTSXS }, { MASK_ALTIVEC, CODE_FOR_altivec_vctuxs, "__builtin_altivec_vctuxs", ALTIVEC_BUILTIN_VCTUXS }, { MASK_ALTIVEC, CODE_FOR_umaxv16qi3, "__builtin_altivec_vmaxub", ALTIVEC_BUILTIN_VMAXUB }, @@ -7047,14 +8606,18 @@ static struct builtin_description bdesc_ { MASK_ALTIVEC, CODE_FOR_sminv4si3, "__builtin_altivec_vminsw", ALTIVEC_BUILTIN_VMINSW }, { MASK_ALTIVEC, CODE_FOR_sminv4sf3, "__builtin_altivec_vminfp", ALTIVEC_BUILTIN_VMINFP }, { MASK_ALTIVEC, CODE_FOR_altivec_vmuleub, "__builtin_altivec_vmuleub", ALTIVEC_BUILTIN_VMULEUB }, + { MASK_ALTIVEC, CODE_FOR_altivec_vmuleub, "__builtin_altivec_vmuleub_uns", ALTIVEC_BUILTIN_VMULEUB_UNS }, { MASK_ALTIVEC, CODE_FOR_altivec_vmulesb, "__builtin_altivec_vmulesb", ALTIVEC_BUILTIN_VMULESB }, { MASK_ALTIVEC, CODE_FOR_altivec_vmuleuh, "__builtin_altivec_vmuleuh", ALTIVEC_BUILTIN_VMULEUH }, + { MASK_ALTIVEC, CODE_FOR_altivec_vmuleuh, "__builtin_altivec_vmuleuh_uns", ALTIVEC_BUILTIN_VMULEUH_UNS }, { MASK_ALTIVEC, CODE_FOR_altivec_vmulesh, "__builtin_altivec_vmulesh", ALTIVEC_BUILTIN_VMULESH }, { MASK_ALTIVEC, CODE_FOR_altivec_vmuloub, "__builtin_altivec_vmuloub", ALTIVEC_BUILTIN_VMULOUB }, + { MASK_ALTIVEC, CODE_FOR_altivec_vmuloub, "__builtin_altivec_vmuloub_uns", ALTIVEC_BUILTIN_VMULOUB_UNS }, { MASK_ALTIVEC, CODE_FOR_altivec_vmulosb, "__builtin_altivec_vmulosb", ALTIVEC_BUILTIN_VMULOSB }, { MASK_ALTIVEC, CODE_FOR_altivec_vmulouh, "__builtin_altivec_vmulouh", ALTIVEC_BUILTIN_VMULOUH }, + { MASK_ALTIVEC, CODE_FOR_altivec_vmulouh, 
"__builtin_altivec_vmulouh_uns", ALTIVEC_BUILTIN_VMULOUH_UNS }, { MASK_ALTIVEC, CODE_FOR_altivec_vmulosh, "__builtin_altivec_vmulosh", ALTIVEC_BUILTIN_VMULOSH }, - { MASK_ALTIVEC, CODE_FOR_altivec_norv4si3, "__builtin_altivec_vnor", ALTIVEC_BUILTIN_VNOR }, + { MASK_ALTIVEC, CODE_FOR_norv4si3, "__builtin_altivec_vnor", ALTIVEC_BUILTIN_VNOR }, { MASK_ALTIVEC, CODE_FOR_iorv4si3, "__builtin_altivec_vor", ALTIVEC_BUILTIN_VOR }, { MASK_ALTIVEC, CODE_FOR_altivec_vpkuhum, "__builtin_altivec_vpkuhum", ALTIVEC_BUILTIN_VPKUHUM }, { MASK_ALTIVEC, CODE_FOR_altivec_vpkuwum, "__builtin_altivec_vpkuwum", ALTIVEC_BUILTIN_VPKUWUM }, @@ -7065,9 +8628,9 @@ static struct builtin_description bdesc_ { MASK_ALTIVEC, CODE_FOR_altivec_vpkshus, "__builtin_altivec_vpkshus", ALTIVEC_BUILTIN_VPKSHUS }, { MASK_ALTIVEC, CODE_FOR_altivec_vpkuwus, "__builtin_altivec_vpkuwus", ALTIVEC_BUILTIN_VPKUWUS }, { MASK_ALTIVEC, CODE_FOR_altivec_vpkswus, "__builtin_altivec_vpkswus", ALTIVEC_BUILTIN_VPKSWUS }, - { MASK_ALTIVEC, CODE_FOR_altivec_vrlb, "__builtin_altivec_vrlb", ALTIVEC_BUILTIN_VRLB }, - { MASK_ALTIVEC, CODE_FOR_altivec_vrlh, "__builtin_altivec_vrlh", ALTIVEC_BUILTIN_VRLH }, - { MASK_ALTIVEC, CODE_FOR_altivec_vrlw, "__builtin_altivec_vrlw", ALTIVEC_BUILTIN_VRLW }, + { MASK_ALTIVEC, CODE_FOR_vrotlv16qi3, "__builtin_altivec_vrlb", ALTIVEC_BUILTIN_VRLB }, + { MASK_ALTIVEC, CODE_FOR_vrotlv8hi3, "__builtin_altivec_vrlh", ALTIVEC_BUILTIN_VRLH }, + { MASK_ALTIVEC, CODE_FOR_vrotlv4si3, "__builtin_altivec_vrlw", ALTIVEC_BUILTIN_VRLW }, { MASK_ALTIVEC, CODE_FOR_vashlv16qi3, "__builtin_altivec_vslb", ALTIVEC_BUILTIN_VSLB }, { MASK_ALTIVEC, CODE_FOR_vashlv8hi3, "__builtin_altivec_vslh", ALTIVEC_BUILTIN_VSLH }, { MASK_ALTIVEC, CODE_FOR_vashlv4si3, "__builtin_altivec_vslw", ALTIVEC_BUILTIN_VSLW }, @@ -7101,9 +8664,50 @@ static struct builtin_description bdesc_ { MASK_ALTIVEC, CODE_FOR_altivec_vsum2sws, "__builtin_altivec_vsum2sws", ALTIVEC_BUILTIN_VSUM2SWS }, { MASK_ALTIVEC, CODE_FOR_altivec_vsumsws, 
"__builtin_altivec_vsumsws", ALTIVEC_BUILTIN_VSUMSWS }, { MASK_ALTIVEC, CODE_FOR_xorv4si3, "__builtin_altivec_vxor", ALTIVEC_BUILTIN_VXOR }, + { MASK_ALTIVEC, CODE_FOR_vector_copysignv4sf3, "__builtin_altivec_copysignfp", ALTIVEC_BUILTIN_COPYSIGN_V4SF }, + + { MASK_VSX, CODE_FOR_addv2df3, "__builtin_vsx_xvadddp", VSX_BUILTIN_XVADDDP }, + { MASK_VSX, CODE_FOR_subv2df3, "__builtin_vsx_xvsubdp", VSX_BUILTIN_XVSUBDP }, + { MASK_VSX, CODE_FOR_mulv2df3, "__builtin_vsx_xvmuldp", VSX_BUILTIN_XVMULDP }, + { MASK_VSX, CODE_FOR_divv2df3, "__builtin_vsx_xvdivdp", VSX_BUILTIN_XVDIVDP }, + { MASK_VSX, CODE_FOR_sminv2df3, "__builtin_vsx_xvmindp", VSX_BUILTIN_XVMINDP }, + { MASK_VSX, CODE_FOR_smaxv2df3, "__builtin_vsx_xvmaxdp", VSX_BUILTIN_XVMAXDP }, + { MASK_VSX, CODE_FOR_vsx_tdivv2df3_fe, "__builtin_vsx_xvtdivdp_fe", VSX_BUILTIN_XVTDIVDP_FE }, + { MASK_VSX, CODE_FOR_vsx_tdivv2df3_fg, "__builtin_vsx_xvtdivdp_fg", VSX_BUILTIN_XVTDIVDP_FG }, + { MASK_VSX, CODE_FOR_vector_eqv2df, "__builtin_vsx_xvcmpeqdp", VSX_BUILTIN_XVCMPEQDP }, + { MASK_VSX, CODE_FOR_vector_gtv2df, "__builtin_vsx_xvcmpgtdp", VSX_BUILTIN_XVCMPGTDP }, + { MASK_VSX, CODE_FOR_vector_gev2df, "__builtin_vsx_xvcmpgedp", VSX_BUILTIN_XVCMPGEDP }, + + { MASK_VSX, CODE_FOR_addv4sf3, "__builtin_vsx_xvaddsp", VSX_BUILTIN_XVADDSP }, + { MASK_VSX, CODE_FOR_subv4sf3, "__builtin_vsx_xvsubsp", VSX_BUILTIN_XVSUBSP }, + { MASK_VSX, CODE_FOR_mulv4sf3, "__builtin_vsx_xvmulsp", VSX_BUILTIN_XVMULSP }, + { MASK_VSX, CODE_FOR_divv4sf3, "__builtin_vsx_xvdivsp", VSX_BUILTIN_XVDIVSP }, + { MASK_VSX, CODE_FOR_sminv4sf3, "__builtin_vsx_xvminsp", VSX_BUILTIN_XVMINSP }, + { MASK_VSX, CODE_FOR_smaxv4sf3, "__builtin_vsx_xvmaxsp", VSX_BUILTIN_XVMAXSP }, + { MASK_VSX, CODE_FOR_vsx_tdivv4sf3_fe, "__builtin_vsx_xvtdivsp_fe", VSX_BUILTIN_XVTDIVSP_FE }, + { MASK_VSX, CODE_FOR_vsx_tdivv4sf3_fg, "__builtin_vsx_xvtdivsp_fg", VSX_BUILTIN_XVTDIVSP_FG }, + { MASK_VSX, CODE_FOR_vector_eqv4sf, "__builtin_vsx_xvcmpeqsp", VSX_BUILTIN_XVCMPEQSP }, + { MASK_VSX, 
CODE_FOR_vector_gtv4sf, "__builtin_vsx_xvcmpgtsp", VSX_BUILTIN_XVCMPGTSP }, + { MASK_VSX, CODE_FOR_vector_gev4sf, "__builtin_vsx_xvcmpgesp", VSX_BUILTIN_XVCMPGESP }, + + { MASK_VSX, CODE_FOR_smindf3, "__builtin_vsx_xsmindp", VSX_BUILTIN_XSMINDP }, + { MASK_VSX, CODE_FOR_smaxdf3, "__builtin_vsx_xsmaxdp", VSX_BUILTIN_XSMAXDP }, + { MASK_VSX, CODE_FOR_vsx_tdivdf3_fe, "__builtin_vsx_xstdivdp_fe", VSX_BUILTIN_XSTDIVDP_FE }, + { MASK_VSX, CODE_FOR_vsx_tdivdf3_fg, "__builtin_vsx_xstdivdp_fg", VSX_BUILTIN_XSTDIVDP_FG }, + { MASK_VSX, CODE_FOR_vector_copysignv2df3, "__builtin_vsx_cpsgndp", VSX_BUILTIN_CPSGNDP }, + { MASK_VSX, CODE_FOR_vector_copysignv4sf3, "__builtin_vsx_cpsgnsp", VSX_BUILTIN_CPSGNSP }, + + { MASK_VSX, CODE_FOR_vsx_concat_v2df, "__builtin_vsx_concat_2df", VSX_BUILTIN_CONCAT_2DF }, + { MASK_VSX, CODE_FOR_vsx_concat_v2di, "__builtin_vsx_concat_2di", VSX_BUILTIN_CONCAT_2DI }, + { MASK_VSX, CODE_FOR_vsx_splat_v2df, "__builtin_vsx_splat_2df", VSX_BUILTIN_SPLAT_2DF }, + { MASK_VSX, CODE_FOR_vsx_splat_v2di, "__builtin_vsx_splat_2di", VSX_BUILTIN_SPLAT_2DI }, + { MASK_VSX, CODE_FOR_vsx_xxmrghw_v4sf, "__builtin_vsx_xxmrghw", VSX_BUILTIN_XXMRGHW_4SF }, + { MASK_VSX, CODE_FOR_vsx_xxmrghw_v4si, "__builtin_vsx_xxmrghw_4si", VSX_BUILTIN_XXMRGHW_4SI }, + { MASK_VSX, CODE_FOR_vsx_xxmrglw_v4sf, "__builtin_vsx_xxmrglw", VSX_BUILTIN_XXMRGLW_4SF }, + { MASK_VSX, CODE_FOR_vsx_xxmrglw_v4si, "__builtin_vsx_xxmrglw_4si", VSX_BUILTIN_XXMRGLW_4SI }, - { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_add", ALTIVEC_BUILTIN_VEC_ADD }, - { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vaddfp", ALTIVEC_BUILTIN_VEC_VADDFP }, + { MASK_ALTIVEC|MASK_VSX, CODE_FOR_nothing, "__builtin_vec_add", ALTIVEC_BUILTIN_VEC_ADD }, + { MASK_ALTIVEC|MASK_VSX, CODE_FOR_nothing, "__builtin_vec_vaddfp", ALTIVEC_BUILTIN_VEC_VADDFP }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vadduwm", ALTIVEC_BUILTIN_VEC_VADDUWM }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vadduhm", ALTIVEC_BUILTIN_VEC_VADDUHM 
}, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vaddubm", ALTIVEC_BUILTIN_VEC_VADDUBM }, @@ -7115,8 +8719,8 @@ static struct builtin_description bdesc_ { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vadduhs", ALTIVEC_BUILTIN_VEC_VADDUHS }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vaddsbs", ALTIVEC_BUILTIN_VEC_VADDSBS }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vaddubs", ALTIVEC_BUILTIN_VEC_VADDUBS }, - { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_and", ALTIVEC_BUILTIN_VEC_AND }, - { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_andc", ALTIVEC_BUILTIN_VEC_ANDC }, + { MASK_ALTIVEC|MASK_VSX, CODE_FOR_nothing, "__builtin_vec_and", ALTIVEC_BUILTIN_VEC_AND }, + { MASK_ALTIVEC|MASK_VSX, CODE_FOR_nothing, "__builtin_vec_andc", ALTIVEC_BUILTIN_VEC_ANDC }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_avg", ALTIVEC_BUILTIN_VEC_AVG }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vavgsw", ALTIVEC_BUILTIN_VEC_VAVGSW }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vavguw", ALTIVEC_BUILTIN_VEC_VAVGUW }, @@ -7141,8 +8745,9 @@ static struct builtin_description bdesc_ { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vcmpgtub", ALTIVEC_BUILTIN_VEC_VCMPGTUB }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_cmple", ALTIVEC_BUILTIN_VEC_CMPLE }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_cmplt", ALTIVEC_BUILTIN_VEC_CMPLT }, - { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_max", ALTIVEC_BUILTIN_VEC_MAX }, - { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vmaxfp", ALTIVEC_BUILTIN_VEC_VMAXFP }, + { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_copysign", ALTIVEC_BUILTIN_VEC_COPYSIGN }, + { MASK_ALTIVEC|MASK_VSX, CODE_FOR_nothing, "__builtin_vec_max", ALTIVEC_BUILTIN_VEC_MAX }, + { MASK_ALTIVEC|MASK_VSX, CODE_FOR_nothing, "__builtin_vec_vmaxfp", ALTIVEC_BUILTIN_VEC_VMAXFP }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vmaxsw", ALTIVEC_BUILTIN_VEC_VMAXSW }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vmaxuw", 
ALTIVEC_BUILTIN_VEC_VMAXUW }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vmaxsh", ALTIVEC_BUILTIN_VEC_VMAXSH }, @@ -7157,8 +8762,8 @@ static struct builtin_description bdesc_ { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vmrglw", ALTIVEC_BUILTIN_VEC_VMRGLW }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vmrglh", ALTIVEC_BUILTIN_VEC_VMRGLH }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vmrglb", ALTIVEC_BUILTIN_VEC_VMRGLB }, - { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_min", ALTIVEC_BUILTIN_VEC_MIN }, - { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vminfp", ALTIVEC_BUILTIN_VEC_VMINFP }, + { MASK_ALTIVEC|MASK_VSX, CODE_FOR_nothing, "__builtin_vec_min", ALTIVEC_BUILTIN_VEC_MIN }, + { MASK_ALTIVEC|MASK_VSX, CODE_FOR_nothing, "__builtin_vec_vminfp", ALTIVEC_BUILTIN_VEC_VMINFP }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vminsw", ALTIVEC_BUILTIN_VEC_VMINSW }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vminuw", ALTIVEC_BUILTIN_VEC_VMINUW }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vminsh", ALTIVEC_BUILTIN_VEC_VMINSH }, @@ -7175,8 +8780,8 @@ static struct builtin_description bdesc_ { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vmulouh", ALTIVEC_BUILTIN_VEC_VMULOUH }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vmulosb", ALTIVEC_BUILTIN_VEC_VMULOSB }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vmuloub", ALTIVEC_BUILTIN_VEC_VMULOUB }, - { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_nor", ALTIVEC_BUILTIN_VEC_NOR }, - { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_or", ALTIVEC_BUILTIN_VEC_OR }, + { MASK_ALTIVEC|MASK_VSX, CODE_FOR_nothing, "__builtin_vec_nor", ALTIVEC_BUILTIN_VEC_NOR }, + { MASK_ALTIVEC|MASK_VSX, CODE_FOR_nothing, "__builtin_vec_or", ALTIVEC_BUILTIN_VEC_OR }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_pack", ALTIVEC_BUILTIN_VEC_PACK }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vpkuwum", ALTIVEC_BUILTIN_VEC_VPKUWUM }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vpkuhum", 
ALTIVEC_BUILTIN_VEC_VPKUHUM }, @@ -7209,8 +8814,8 @@ static struct builtin_description bdesc_ { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vsrab", ALTIVEC_BUILTIN_VEC_VSRAB }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_srl", ALTIVEC_BUILTIN_VEC_SRL }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_sro", ALTIVEC_BUILTIN_VEC_SRO }, - { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_sub", ALTIVEC_BUILTIN_VEC_SUB }, - { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vsubfp", ALTIVEC_BUILTIN_VEC_VSUBFP }, + { MASK_ALTIVEC|MASK_VSX, CODE_FOR_nothing, "__builtin_vec_sub", ALTIVEC_BUILTIN_VEC_SUB }, + { MASK_ALTIVEC|MASK_VSX, CODE_FOR_nothing, "__builtin_vec_vsubfp", ALTIVEC_BUILTIN_VEC_VSUBFP }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vsubuwm", ALTIVEC_BUILTIN_VEC_VSUBUWM }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vsubuhm", ALTIVEC_BUILTIN_VEC_VSUBUHM }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vsububm", ALTIVEC_BUILTIN_VEC_VSUBUBM }, @@ -7228,7 +8833,10 @@ static struct builtin_description bdesc_ { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vsum4ubs", ALTIVEC_BUILTIN_VEC_VSUM4UBS }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_sum2s", ALTIVEC_BUILTIN_VEC_SUM2S }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_sums", ALTIVEC_BUILTIN_VEC_SUMS }, - { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_xor", ALTIVEC_BUILTIN_VEC_XOR }, + { MASK_ALTIVEC|MASK_VSX, CODE_FOR_nothing, "__builtin_vec_xor", ALTIVEC_BUILTIN_VEC_XOR }, + + { MASK_VSX, CODE_FOR_nothing, "__builtin_vec_mul", VSX_BUILTIN_VEC_MUL }, + { MASK_VSX, CODE_FOR_nothing, "__builtin_vec_div", VSX_BUILTIN_VEC_DIV }, { 0, CODE_FOR_divv2sf3, "__builtin_paired_divv2sf3", PAIRED_BUILTIN_DIVV2SF3 }, { 0, CODE_FOR_addv2sf3, "__builtin_paired_addv2sf3", PAIRED_BUILTIN_ADDV2SF3 }, @@ -7392,30 +9000,58 @@ struct builtin_description_predicates { const unsigned int mask; const enum insn_code icode; - const char *opcode; const char *const name; const enum rs6000_builtins code; }; 
static const struct builtin_description_predicates bdesc_altivec_preds[] = { - { MASK_ALTIVEC, CODE_FOR_altivec_predicate_v4sf, "*vcmpbfp.", "__builtin_altivec_vcmpbfp_p", ALTIVEC_BUILTIN_VCMPBFP_P }, - { MASK_ALTIVEC, CODE_FOR_altivec_predicate_v4sf, "*vcmpeqfp.", "__builtin_altivec_vcmpeqfp_p", ALTIVEC_BUILTIN_VCMPEQFP_P }, - { MASK_ALTIVEC, CODE_FOR_altivec_predicate_v4sf, "*vcmpgefp.", "__builtin_altivec_vcmpgefp_p", ALTIVEC_BUILTIN_VCMPGEFP_P }, - { MASK_ALTIVEC, CODE_FOR_altivec_predicate_v4sf, "*vcmpgtfp.", "__builtin_altivec_vcmpgtfp_p", ALTIVEC_BUILTIN_VCMPGTFP_P }, - { MASK_ALTIVEC, CODE_FOR_altivec_predicate_v4si, "*vcmpequw.", "__builtin_altivec_vcmpequw_p", ALTIVEC_BUILTIN_VCMPEQUW_P }, - { MASK_ALTIVEC, CODE_FOR_altivec_predicate_v4si, "*vcmpgtsw.", "__builtin_altivec_vcmpgtsw_p", ALTIVEC_BUILTIN_VCMPGTSW_P }, - { MASK_ALTIVEC, CODE_FOR_altivec_predicate_v4si, "*vcmpgtuw.", "__builtin_altivec_vcmpgtuw_p", ALTIVEC_BUILTIN_VCMPGTUW_P }, - { MASK_ALTIVEC, CODE_FOR_altivec_predicate_v8hi, "*vcmpgtuh.", "__builtin_altivec_vcmpgtuh_p", ALTIVEC_BUILTIN_VCMPGTUH_P }, - { MASK_ALTIVEC, CODE_FOR_altivec_predicate_v8hi, "*vcmpgtsh.", "__builtin_altivec_vcmpgtsh_p", ALTIVEC_BUILTIN_VCMPGTSH_P }, - { MASK_ALTIVEC, CODE_FOR_altivec_predicate_v8hi, "*vcmpequh.", "__builtin_altivec_vcmpequh_p", ALTIVEC_BUILTIN_VCMPEQUH_P }, - { MASK_ALTIVEC, CODE_FOR_altivec_predicate_v16qi, "*vcmpequb.", "__builtin_altivec_vcmpequb_p", ALTIVEC_BUILTIN_VCMPEQUB_P }, - { MASK_ALTIVEC, CODE_FOR_altivec_predicate_v16qi, "*vcmpgtsb.", "__builtin_altivec_vcmpgtsb_p", ALTIVEC_BUILTIN_VCMPGTSB_P }, - { MASK_ALTIVEC, CODE_FOR_altivec_predicate_v16qi, "*vcmpgtub.", "__builtin_altivec_vcmpgtub_p", ALTIVEC_BUILTIN_VCMPGTUB_P }, - - { MASK_ALTIVEC, 0, NULL, "__builtin_vec_vcmpeq_p", ALTIVEC_BUILTIN_VCMPEQ_P }, - { MASK_ALTIVEC, 0, NULL, "__builtin_vec_vcmpgt_p", ALTIVEC_BUILTIN_VCMPGT_P }, - { MASK_ALTIVEC, 0, NULL, "__builtin_vec_vcmpge_p", ALTIVEC_BUILTIN_VCMPGE_P } + { MASK_ALTIVEC, 
CODE_FOR_altivec_vcmpbfp_p, "__builtin_altivec_vcmpbfp_p", + ALTIVEC_BUILTIN_VCMPBFP_P }, + { MASK_ALTIVEC|MASK_VSX, CODE_FOR_vector_eq_v4sf_p, + "__builtin_altivec_vcmpeqfp_p", ALTIVEC_BUILTIN_VCMPEQFP_P }, + { MASK_ALTIVEC|MASK_VSX, CODE_FOR_vector_ge_v4sf_p, + "__builtin_altivec_vcmpgefp_p", ALTIVEC_BUILTIN_VCMPGEFP_P }, + { MASK_ALTIVEC|MASK_VSX, CODE_FOR_vector_gt_v4sf_p, + "__builtin_altivec_vcmpgtfp_p", ALTIVEC_BUILTIN_VCMPGTFP_P }, + { MASK_ALTIVEC, CODE_FOR_vector_eq_v4si_p, "__builtin_altivec_vcmpequw_p", + ALTIVEC_BUILTIN_VCMPEQUW_P }, + { MASK_ALTIVEC, CODE_FOR_vector_gt_v4si_p, "__builtin_altivec_vcmpgtsw_p", + ALTIVEC_BUILTIN_VCMPGTSW_P }, + { MASK_ALTIVEC, CODE_FOR_vector_gtu_v4si_p, "__builtin_altivec_vcmpgtuw_p", + ALTIVEC_BUILTIN_VCMPGTUW_P }, + { MASK_ALTIVEC, CODE_FOR_vector_eq_v8hi_p, "__builtin_altivec_vcmpequh_p", + ALTIVEC_BUILTIN_VCMPEQUH_P }, + { MASK_ALTIVEC, CODE_FOR_vector_gt_v8hi_p, "__builtin_altivec_vcmpgtsh_p", + ALTIVEC_BUILTIN_VCMPGTSH_P }, + { MASK_ALTIVEC, CODE_FOR_vector_gtu_v8hi_p, "__builtin_altivec_vcmpgtuh_p", + ALTIVEC_BUILTIN_VCMPGTUH_P }, + { MASK_ALTIVEC, CODE_FOR_vector_eq_v16qi_p, "__builtin_altivec_vcmpequb_p", + ALTIVEC_BUILTIN_VCMPEQUB_P }, + { MASK_ALTIVEC, CODE_FOR_vector_gt_v16qi_p, "__builtin_altivec_vcmpgtsb_p", + ALTIVEC_BUILTIN_VCMPGTSB_P }, + { MASK_ALTIVEC, CODE_FOR_vector_gtu_v16qi_p, "__builtin_altivec_vcmpgtub_p", + ALTIVEC_BUILTIN_VCMPGTUB_P }, + + { MASK_VSX, CODE_FOR_vector_eq_v4sf_p, "__builtin_vsx_xvcmpeqsp_p", + VSX_BUILTIN_XVCMPEQSP_P }, + { MASK_VSX, CODE_FOR_vector_ge_v4sf_p, "__builtin_vsx_xvcmpgesp_p", + VSX_BUILTIN_XVCMPGESP_P }, + { MASK_VSX, CODE_FOR_vector_gt_v4sf_p, "__builtin_vsx_xvcmpgtsp_p", + VSX_BUILTIN_XVCMPGTSP_P }, + { MASK_VSX, CODE_FOR_vector_eq_v2df_p, "__builtin_vsx_xvcmpeqdp_p", + VSX_BUILTIN_XVCMPEQDP_P }, + { MASK_VSX, CODE_FOR_vector_ge_v2df_p, "__builtin_vsx_xvcmpgedp_p", + VSX_BUILTIN_XVCMPGEDP_P }, + { MASK_VSX, CODE_FOR_vector_gt_v2df_p, "__builtin_vsx_xvcmpgtdp_p", + 
VSX_BUILTIN_XVCMPGTDP_P }, + + { MASK_ALTIVEC|MASK_VSX, CODE_FOR_nothing, "__builtin_vec_vcmpeq_p", + ALTIVEC_BUILTIN_VCMPEQ_P }, + { MASK_ALTIVEC|MASK_VSX, CODE_FOR_nothing, "__builtin_vec_vcmpgt_p", + ALTIVEC_BUILTIN_VCMPGT_P }, + { MASK_ALTIVEC|MASK_VSX, CODE_FOR_nothing, "__builtin_vec_vcmpge_p", + ALTIVEC_BUILTIN_VCMPGE_P } }; /* SPE predicates. */ @@ -7473,7 +9109,11 @@ static const struct builtin_description { MASK_ALTIVEC, CODE_FOR_absv16qi2, "__builtin_altivec_abs_v16qi", ALTIVEC_BUILTIN_ABS_V16QI }, { MASK_ALTIVEC, CODE_FOR_altivec_abss_v4si, "__builtin_altivec_abss_v4si", ALTIVEC_BUILTIN_ABSS_V4SI }, { MASK_ALTIVEC, CODE_FOR_altivec_abss_v8hi, "__builtin_altivec_abss_v8hi", ALTIVEC_BUILTIN_ABSS_V8HI }, - { MASK_ALTIVEC, CODE_FOR_altivec_abss_v16qi, "__builtin_altivec_abss_v16qi", ALTIVEC_BUILTIN_ABSS_V16QI } + { MASK_ALTIVEC, CODE_FOR_altivec_abss_v16qi, "__builtin_altivec_abss_v16qi", ALTIVEC_BUILTIN_ABSS_V16QI }, + { MASK_VSX, CODE_FOR_absv2df2, "__builtin_vsx_xvabsdp", VSX_BUILTIN_XVABSDP }, + { MASK_VSX, CODE_FOR_vsx_nabsv2df2, "__builtin_vsx_xvnabsdp", VSX_BUILTIN_XVNABSDP }, + { MASK_VSX, CODE_FOR_absv4sf2, "__builtin_vsx_xvabssp", VSX_BUILTIN_XVABSSP }, + { MASK_VSX, CODE_FOR_vsx_nabsv4sf2, "__builtin_vsx_xvnabssp", VSX_BUILTIN_XVNABSSP }, }; /* Simple unary operations: VECb = foo (unsigned literal) or VECb = @@ -7484,10 +9124,10 @@ static struct builtin_description bdesc_ { MASK_ALTIVEC, CODE_FOR_altivec_vexptefp, "__builtin_altivec_vexptefp", ALTIVEC_BUILTIN_VEXPTEFP }, { MASK_ALTIVEC, CODE_FOR_altivec_vlogefp, "__builtin_altivec_vlogefp", ALTIVEC_BUILTIN_VLOGEFP }, { MASK_ALTIVEC, CODE_FOR_altivec_vrefp, "__builtin_altivec_vrefp", ALTIVEC_BUILTIN_VREFP }, - { MASK_ALTIVEC, CODE_FOR_altivec_vrfim, "__builtin_altivec_vrfim", ALTIVEC_BUILTIN_VRFIM }, + { MASK_ALTIVEC, CODE_FOR_vector_floorv4sf2, "__builtin_altivec_vrfim", ALTIVEC_BUILTIN_VRFIM }, { MASK_ALTIVEC, CODE_FOR_altivec_vrfin, "__builtin_altivec_vrfin", ALTIVEC_BUILTIN_VRFIN }, - { 
MASK_ALTIVEC, CODE_FOR_altivec_vrfip, "__builtin_altivec_vrfip", ALTIVEC_BUILTIN_VRFIP }, - { MASK_ALTIVEC, CODE_FOR_ftruncv4sf2, "__builtin_altivec_vrfiz", ALTIVEC_BUILTIN_VRFIZ }, + { MASK_ALTIVEC, CODE_FOR_vector_ceilv4sf2, "__builtin_altivec_vrfip", ALTIVEC_BUILTIN_VRFIP }, + { MASK_ALTIVEC, CODE_FOR_vector_btruncv4sf2, "__builtin_altivec_vrfiz", ALTIVEC_BUILTIN_VRFIZ }, { MASK_ALTIVEC, CODE_FOR_altivec_vrsqrtefp, "__builtin_altivec_vrsqrtefp", ALTIVEC_BUILTIN_VRSQRTEFP }, { MASK_ALTIVEC, CODE_FOR_altivec_vspltisb, "__builtin_altivec_vspltisb", ALTIVEC_BUILTIN_VSPLTISB }, { MASK_ALTIVEC, CODE_FOR_altivec_vspltish, "__builtin_altivec_vspltish", ALTIVEC_BUILTIN_VSPLTISH }, @@ -7499,6 +9139,65 @@ static struct builtin_description bdesc_ { MASK_ALTIVEC, CODE_FOR_altivec_vupklpx, "__builtin_altivec_vupklpx", ALTIVEC_BUILTIN_VUPKLPX }, { MASK_ALTIVEC, CODE_FOR_altivec_vupklsh, "__builtin_altivec_vupklsh", ALTIVEC_BUILTIN_VUPKLSH }, + { MASK_VSX, CODE_FOR_negv2df2, "__builtin_vsx_xvnegdp", VSX_BUILTIN_XVNEGDP }, + { MASK_VSX, CODE_FOR_sqrtv2df2, "__builtin_vsx_xvsqrtdp", VSX_BUILTIN_XVSQRTDP }, + { MASK_VSX, CODE_FOR_vsx_rsqrtev2df2, "__builtin_vsx_xvrsqrtedp", VSX_BUILTIN_XVRSQRTEDP }, + { MASK_VSX, CODE_FOR_vsx_tsqrtv2df2_fe, "__builtin_vsx_xvtsqrtdp_fe", VSX_BUILTIN_XVTSQRTDP_FE }, + { MASK_VSX, CODE_FOR_vsx_tsqrtv2df2_fg, "__builtin_vsx_xvtsqrtdp_fg", VSX_BUILTIN_XVTSQRTDP_FG }, + { MASK_VSX, CODE_FOR_vsx_frev2df2, "__builtin_vsx_xvredp", VSX_BUILTIN_XVREDP }, + + { MASK_VSX, CODE_FOR_negv4sf2, "__builtin_vsx_xvnegsp", VSX_BUILTIN_XVNEGSP }, + { MASK_VSX, CODE_FOR_sqrtv4sf2, "__builtin_vsx_xvsqrtsp", VSX_BUILTIN_XVSQRTSP }, + { MASK_VSX, CODE_FOR_vsx_rsqrtev4sf2, "__builtin_vsx_xvrsqrtesp", VSX_BUILTIN_XVRSQRTESP }, + { MASK_VSX, CODE_FOR_vsx_tsqrtv4sf2_fe, "__builtin_vsx_xvtsqrtsp_fe", VSX_BUILTIN_XVTSQRTSP_FE }, + { MASK_VSX, CODE_FOR_vsx_tsqrtv4sf2_fg, "__builtin_vsx_xvtsqrtsp_fg", VSX_BUILTIN_XVTSQRTSP_FG }, + { MASK_VSX, CODE_FOR_vsx_frev4sf2, 
"__builtin_vsx_xvresp", VSX_BUILTIN_XVRESP }, + + { MASK_VSX, CODE_FOR_vsx_xscvdpsp, "__builtin_vsx_xscvdpsp", VSX_BUILTIN_XSCVDPSP }, + { MASK_VSX, CODE_FOR_vsx_xscvdpsp, "__builtin_vsx_xscvspdp", VSX_BUILTIN_XSCVSPDP }, + { MASK_VSX, CODE_FOR_vsx_xvcvdpsp, "__builtin_vsx_xvcvdpsp", VSX_BUILTIN_XVCVDPSP }, + { MASK_VSX, CODE_FOR_vsx_xvcvspdp, "__builtin_vsx_xvcvspdp", VSX_BUILTIN_XVCVSPDP }, + { MASK_VSX, CODE_FOR_vsx_tsqrtdf2_fe, "__builtin_vsx_xstsqrtdp_fe", VSX_BUILTIN_XSTSQRTDP_FE }, + { MASK_VSX, CODE_FOR_vsx_tsqrtdf2_fg, "__builtin_vsx_xstsqrtdp_fg", VSX_BUILTIN_XSTSQRTDP_FG }, + + { MASK_VSX, CODE_FOR_vsx_fix_truncv2dfv2di2, "__builtin_vsx_xvcvdpsxds", VSX_BUILTIN_XVCVDPSXDS }, + { MASK_VSX, CODE_FOR_vsx_fixuns_truncv2dfv2di2, "__builtin_vsx_xvcvdpuxds", VSX_BUILTIN_XVCVDPUXDS }, + { MASK_VSX, CODE_FOR_vsx_fixuns_truncv2dfv2di2, "__builtin_vsx_xvcvdpuxds_uns", VSX_BUILTIN_XVCVDPUXDS_UNS }, + { MASK_VSX, CODE_FOR_vsx_floatv2div2df2, "__builtin_vsx_xvcvsxddp", VSX_BUILTIN_XVCVSXDDP }, + { MASK_VSX, CODE_FOR_vsx_floatunsv2div2df2, "__builtin_vsx_xvcvuxddp", VSX_BUILTIN_XVCVUXDDP }, + { MASK_VSX, CODE_FOR_vsx_floatunsv2div2df2, "__builtin_vsx_xvcvuxddp_uns", VSX_BUILTIN_XVCVUXDDP_UNS }, + + { MASK_VSX, CODE_FOR_vsx_fix_truncv4sfv4si2, "__builtin_vsx_xvcvspsxws", VSX_BUILTIN_XVCVSPSXWS }, + { MASK_VSX, CODE_FOR_vsx_fixuns_truncv4sfv4si2, "__builtin_vsx_xvcvspuxws", VSX_BUILTIN_XVCVSPUXWS }, + { MASK_VSX, CODE_FOR_vsx_floatv4siv4sf2, "__builtin_vsx_xvcvsxwsp", VSX_BUILTIN_XVCVSXWSP }, + { MASK_VSX, CODE_FOR_vsx_floatunsv4siv4sf2, "__builtin_vsx_xvcvuxwsp", VSX_BUILTIN_XVCVUXWSP }, + + { MASK_VSX, CODE_FOR_vsx_xvcvdpsxws, "__builtin_vsx_xvcvdpsxws", VSX_BUILTIN_XVCVDPSXWS }, + { MASK_VSX, CODE_FOR_vsx_xvcvdpuxws, "__builtin_vsx_xvcvdpuxws", VSX_BUILTIN_XVCVDPUXWS }, + { MASK_VSX, CODE_FOR_vsx_xvcvsxwdp, "__builtin_vsx_xvcvsxwdp", VSX_BUILTIN_XVCVSXWDP }, + { MASK_VSX, CODE_FOR_vsx_xvcvuxwdp, "__builtin_vsx_xvcvuxwdp", VSX_BUILTIN_XVCVUXWDP }, + { MASK_VSX, 
CODE_FOR_vsx_xvrdpi, "__builtin_vsx_xvrdpi", VSX_BUILTIN_XVRDPI }, + { MASK_VSX, CODE_FOR_vsx_xvrdpic, "__builtin_vsx_xvrdpic", VSX_BUILTIN_XVRDPIC }, + { MASK_VSX, CODE_FOR_vsx_floorv2df2, "__builtin_vsx_xvrdpim", VSX_BUILTIN_XVRDPIM }, + { MASK_VSX, CODE_FOR_vsx_ceilv2df2, "__builtin_vsx_xvrdpip", VSX_BUILTIN_XVRDPIP }, + { MASK_VSX, CODE_FOR_vsx_btruncv2df2, "__builtin_vsx_xvrdpiz", VSX_BUILTIN_XVRDPIZ }, + + { MASK_VSX, CODE_FOR_vsx_xvcvspsxds, "__builtin_vsx_xvcvspsxds", VSX_BUILTIN_XVCVSPSXDS }, + { MASK_VSX, CODE_FOR_vsx_xvcvspuxds, "__builtin_vsx_xvcvspuxds", VSX_BUILTIN_XVCVSPUXDS }, + { MASK_VSX, CODE_FOR_vsx_xvcvsxdsp, "__builtin_vsx_xvcvsxdsp", VSX_BUILTIN_XVCVSXDSP }, + { MASK_VSX, CODE_FOR_vsx_xvcvuxdsp, "__builtin_vsx_xvcvuxdsp", VSX_BUILTIN_XVCVUXDSP }, + { MASK_VSX, CODE_FOR_vsx_xvrspi, "__builtin_vsx_xvrspi", VSX_BUILTIN_XVRSPI }, + { MASK_VSX, CODE_FOR_vsx_xvrspic, "__builtin_vsx_xvrspic", VSX_BUILTIN_XVRSPIC }, + { MASK_VSX, CODE_FOR_vsx_floorv4sf2, "__builtin_vsx_xvrspim", VSX_BUILTIN_XVRSPIM }, + { MASK_VSX, CODE_FOR_vsx_ceilv4sf2, "__builtin_vsx_xvrspip", VSX_BUILTIN_XVRSPIP }, + { MASK_VSX, CODE_FOR_vsx_btruncv4sf2, "__builtin_vsx_xvrspiz", VSX_BUILTIN_XVRSPIZ }, + + { MASK_VSX, CODE_FOR_vsx_xsrdpi, "__builtin_vsx_xsrdpi", VSX_BUILTIN_XSRDPI }, + { MASK_VSX, CODE_FOR_vsx_xsrdpic, "__builtin_vsx_xsrdpic", VSX_BUILTIN_XSRDPIC }, + { MASK_VSX, CODE_FOR_vsx_floordf2, "__builtin_vsx_xsrdpim", VSX_BUILTIN_XSRDPIM }, + { MASK_VSX, CODE_FOR_vsx_ceildf2, "__builtin_vsx_xsrdpip", VSX_BUILTIN_XSRDPIP }, + { MASK_VSX, CODE_FOR_vsx_btruncdf2, "__builtin_vsx_xsrdpiz", VSX_BUILTIN_XSRDPIZ }, + { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_abs", ALTIVEC_BUILTIN_VEC_ABS }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_abss", ALTIVEC_BUILTIN_VEC_ABSS }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_ceil", ALTIVEC_BUILTIN_VEC_CEIL }, @@ -7519,6 +9218,15 @@ static struct builtin_description bdesc_ { MASK_ALTIVEC, CODE_FOR_nothing, 
"__builtin_vec_vupklsh", ALTIVEC_BUILTIN_VEC_VUPKLSH }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vupklsb", ALTIVEC_BUILTIN_VEC_VUPKLSB }, + { MASK_VSX, CODE_FOR_nothing, "__builtin_vec_nearbyint", ALTIVEC_BUILTIN_VEC_NEARBYINT }, + { MASK_VSX, CODE_FOR_nothing, "__builtin_vec_rint", ALTIVEC_BUILTIN_VEC_RINT }, + { MASK_VSX, CODE_FOR_nothing, "__builtin_vec_sqrt", ALTIVEC_BUILTIN_VEC_SQRT }, + + { MASK_ALTIVEC|MASK_VSX, CODE_FOR_floatv4siv4sf2, "__builtin_vec_float_sisf", VECTOR_BUILTIN_FLOAT_V4SI_V4SF }, + { MASK_ALTIVEC|MASK_VSX, CODE_FOR_unsigned_floatv4siv4sf2, "__builtin_vec_uns_float_sisf", VECTOR_BUILTIN_UNSFLOAT_V4SI_V4SF }, + { MASK_ALTIVEC|MASK_VSX, CODE_FOR_fix_truncv4sfv4si2, "__builtin_vec_fix_sfsi", VECTOR_BUILTIN_FIX_V4SF_V4SI }, + { MASK_ALTIVEC|MASK_VSX, CODE_FOR_fixuns_truncv4sfv4si2, "__builtin_vec_fixuns_sfsi", VECTOR_BUILTIN_FIXUNS_V4SF_V4SI }, + /* The SPE unary builtins must start with SPE_BUILTIN_EVABS and end with SPE_BUILTIN_EVSUBFUSIAAW. */ { 0, CODE_FOR_spe_evabs, "__builtin_spe_evabs", SPE_BUILTIN_EVABS }, @@ -7715,8 +9423,7 @@ rs6000_expand_binop_builtin (enum insn_c } static rtx -altivec_expand_predicate_builtin (enum insn_code icode, const char *opcode, - tree exp, rtx target) +altivec_expand_predicate_builtin (enum insn_code icode, tree exp, rtx target) { rtx pat, scratch; tree cr6_form = CALL_EXPR_ARG (exp, 0); @@ -7755,8 +9462,7 @@ altivec_expand_predicate_builtin (enum i scratch = gen_reg_rtx (mode0); - pat = GEN_FCN (icode) (scratch, op0, op1, - gen_rtx_SYMBOL_REF (Pmode, opcode)); + pat = GEN_FCN (icode) (scratch, op0, op1); if (! 
pat) return 0; emit_insn (pat); @@ -8023,11 +9729,12 @@ rs6000_expand_ternop_builtin (enum insn_ || arg2 == error_mark_node) return const0_rtx; - if (icode == CODE_FOR_altivec_vsldoi_v4sf - || icode == CODE_FOR_altivec_vsldoi_v4si - || icode == CODE_FOR_altivec_vsldoi_v8hi - || icode == CODE_FOR_altivec_vsldoi_v16qi) + switch (icode) { + case CODE_FOR_altivec_vsldoi_v4sf: + case CODE_FOR_altivec_vsldoi_v4si: + case CODE_FOR_altivec_vsldoi_v8hi: + case CODE_FOR_altivec_vsldoi_v16qi: /* Only allow 4-bit unsigned literals. */ STRIP_NOPS (arg2); if (TREE_CODE (arg2) != INTEGER_CST @@ -8036,6 +9743,40 @@ rs6000_expand_ternop_builtin (enum insn_ error ("argument 3 must be a 4-bit unsigned literal"); return const0_rtx; } + break; + + case CODE_FOR_vsx_xxpermdi_v2df: + case CODE_FOR_vsx_xxpermdi_v2di: + case CODE_FOR_vsx_xxsldwi_v16qi: + case CODE_FOR_vsx_xxsldwi_v8hi: + case CODE_FOR_vsx_xxsldwi_v4si: + case CODE_FOR_vsx_xxsldwi_v4sf: + case CODE_FOR_vsx_xxsldwi_v2di: + case CODE_FOR_vsx_xxsldwi_v2df: + /* Only allow 2-bit unsigned literals. */ + STRIP_NOPS (arg2); + if (TREE_CODE (arg2) != INTEGER_CST + || TREE_INT_CST_LOW (arg2) & ~0x3) + { + error ("argument 3 must be a 2-bit unsigned literal"); + return const0_rtx; + } + break; + + case CODE_FOR_vsx_set_v2df: + case CODE_FOR_vsx_set_v2di: + /* Only allow 1-bit unsigned literals. 
*/ + STRIP_NOPS (arg2); + if (TREE_CODE (arg2) != INTEGER_CST + || TREE_INT_CST_LOW (arg2) & ~0x1) + { + error ("argument 3 must be a 1-bit unsigned literal"); + return const0_rtx; + } + break; + + default: + break; } if (target == 0 @@ -8075,16 +9816,16 @@ altivec_expand_ld_builtin (tree exp, rtx switch (fcode) { case ALTIVEC_BUILTIN_LD_INTERNAL_16qi: - icode = CODE_FOR_altivec_lvx_v16qi; + icode = CODE_FOR_vector_load_v16qi; break; case ALTIVEC_BUILTIN_LD_INTERNAL_8hi: - icode = CODE_FOR_altivec_lvx_v8hi; + icode = CODE_FOR_vector_load_v8hi; break; case ALTIVEC_BUILTIN_LD_INTERNAL_4si: - icode = CODE_FOR_altivec_lvx_v4si; + icode = CODE_FOR_vector_load_v4si; break; case ALTIVEC_BUILTIN_LD_INTERNAL_4sf: - icode = CODE_FOR_altivec_lvx_v4sf; + icode = CODE_FOR_vector_load_v4sf; break; default: *expandedp = false; @@ -8128,16 +9869,16 @@ altivec_expand_st_builtin (tree exp, rtx switch (fcode) { case ALTIVEC_BUILTIN_ST_INTERNAL_16qi: - icode = CODE_FOR_altivec_stvx_v16qi; + icode = CODE_FOR_vector_store_v16qi; break; case ALTIVEC_BUILTIN_ST_INTERNAL_8hi: - icode = CODE_FOR_altivec_stvx_v8hi; + icode = CODE_FOR_vector_store_v8hi; break; case ALTIVEC_BUILTIN_ST_INTERNAL_4si: - icode = CODE_FOR_altivec_stvx_v4si; + icode = CODE_FOR_vector_store_v4si; break; case ALTIVEC_BUILTIN_ST_INTERNAL_4sf: - icode = CODE_FOR_altivec_stvx_v4sf; + icode = CODE_FOR_vector_store_v4sf; break; default: *expandedp = false; @@ -8284,8 +10025,8 @@ altivec_expand_vec_set_builtin (tree exp mode1 = TYPE_MODE (TREE_TYPE (TREE_TYPE (arg0))); gcc_assert (VECTOR_MODE_P (tmode)); - op0 = expand_expr (arg0, NULL_RTX, tmode, 0); - op1 = expand_expr (arg1, NULL_RTX, mode1, 0); + op0 = expand_expr (arg0, NULL_RTX, tmode, EXPAND_NORMAL); + op1 = expand_expr (arg1, NULL_RTX, mode1, EXPAND_NORMAL); elt = get_element_number (TREE_TYPE (arg0), arg2); if (GET_MODE (op1) != mode1 && GET_MODE (op1) != VOIDmode) @@ -8343,8 +10084,10 @@ altivec_expand_builtin (tree exp, rtx ta enum machine_mode tmode, mode0; 
unsigned int fcode = DECL_FUNCTION_CODE (fndecl); - if (fcode >= ALTIVEC_BUILTIN_OVERLOADED_FIRST - && fcode <= ALTIVEC_BUILTIN_OVERLOADED_LAST) + if ((fcode >= ALTIVEC_BUILTIN_OVERLOADED_FIRST + && fcode <= ALTIVEC_BUILTIN_OVERLOADED_LAST) + || (fcode >= VSX_BUILTIN_OVERLOADED_FIRST + && fcode <= VSX_BUILTIN_OVERLOADED_LAST)) { *expandedp = true; error ("unresolved overload for Altivec builtin %qF", fndecl); @@ -8452,18 +10195,24 @@ altivec_expand_builtin (tree exp, rtx ta case ALTIVEC_BUILTIN_VEC_INIT_V8HI: case ALTIVEC_BUILTIN_VEC_INIT_V16QI: case ALTIVEC_BUILTIN_VEC_INIT_V4SF: + case VSX_BUILTIN_VEC_INIT_V2DF: + case VSX_BUILTIN_VEC_INIT_V2DI: return altivec_expand_vec_init_builtin (TREE_TYPE (exp), exp, target); case ALTIVEC_BUILTIN_VEC_SET_V4SI: case ALTIVEC_BUILTIN_VEC_SET_V8HI: case ALTIVEC_BUILTIN_VEC_SET_V16QI: case ALTIVEC_BUILTIN_VEC_SET_V4SF: + case VSX_BUILTIN_VEC_SET_V2DF: + case VSX_BUILTIN_VEC_SET_V2DI: return altivec_expand_vec_set_builtin (exp); case ALTIVEC_BUILTIN_VEC_EXT_V4SI: case ALTIVEC_BUILTIN_VEC_EXT_V8HI: case ALTIVEC_BUILTIN_VEC_EXT_V16QI: case ALTIVEC_BUILTIN_VEC_EXT_V4SF: + case VSX_BUILTIN_VEC_EXT_V2DF: + case VSX_BUILTIN_VEC_EXT_V2DI: return altivec_expand_vec_ext_builtin (exp, target); default: @@ -8481,8 +10230,7 @@ altivec_expand_builtin (tree exp, rtx ta dp = bdesc_altivec_preds; for (i = 0; i < ARRAY_SIZE (bdesc_altivec_preds); i++, dp++) if (dp->code == fcode) - return altivec_expand_predicate_builtin (dp->icode, dp->opcode, - exp, target); + return altivec_expand_predicate_builtin (dp->icode, exp, target); /* LV* are funky. We initialized them differently. 
*/ switch (fcode) @@ -8976,13 +10724,21 @@ rs6000_expand_builtin (tree exp, rtx tar bool success; if (fcode == RS6000_BUILTIN_RECIP) - return rs6000_expand_binop_builtin (CODE_FOR_recipdf3, exp, target); + return rs6000_expand_binop_builtin (CODE_FOR_recipdf3, exp, target); if (fcode == RS6000_BUILTIN_RECIPF) - return rs6000_expand_binop_builtin (CODE_FOR_recipsf3, exp, target); + return rs6000_expand_binop_builtin (CODE_FOR_recipsf3, exp, target); if (fcode == RS6000_BUILTIN_RSQRTF) - return rs6000_expand_unop_builtin (CODE_FOR_rsqrtsf2, exp, target); + return rs6000_expand_unop_builtin (CODE_FOR_rsqrtsf2, exp, target); + + if (fcode == RS6000_BUILTIN_BSWAP_HI) + return rs6000_expand_unop_builtin (CODE_FOR_bswaphi2, exp, target); + + if (fcode == POWER7_BUILTIN_BPERMD) + return rs6000_expand_binop_builtin (((TARGET_64BIT) + ? CODE_FOR_bpermd_di + : CODE_FOR_bpermd_si), exp, target); if (fcode == ALTIVEC_BUILTIN_MASK_FOR_LOAD || fcode == ALTIVEC_BUILTIN_MASK_FOR_STORE) @@ -9027,7 +10783,9 @@ rs6000_expand_builtin (tree exp, rtx tar /* FIXME: There's got to be a nicer way to handle this case than constructing a new CALL_EXPR. */ if (fcode == ALTIVEC_BUILTIN_VCFUX - || fcode == ALTIVEC_BUILTIN_VCFSX) + || fcode == ALTIVEC_BUILTIN_VCFSX + || fcode == ALTIVEC_BUILTIN_VCTUXS + || fcode == ALTIVEC_BUILTIN_VCTSXS) { if (call_expr_nargs (exp) == 1) exp = build_call_nary (TREE_TYPE (exp), CALL_EXPR_FN (exp), @@ -9056,7 +10814,7 @@ rs6000_expand_builtin (tree exp, rtx tar return ret; } - gcc_assert (TARGET_ALTIVEC || TARGET_SPE || TARGET_PAIRED_FLOAT); + gcc_assert (TARGET_ALTIVEC || TARGET_VSX || TARGET_SPE || TARGET_PAIRED_FLOAT); /* Handle simple unary operations. 
*/ d = (struct builtin_description *) bdesc_1arg; @@ -9093,6 +10851,8 @@ rs6000_init_builtins (void) { V2SI_type_node = build_vector_type (intSI_type_node, 2); V2SF_type_node = build_vector_type (float_type_node, 2); + V2DI_type_node = build_vector_type (intDI_type_node, 2); + V2DF_type_node = build_vector_type (double_type_node, 2); V4HI_type_node = build_vector_type (intHI_type_node, 4); V4SI_type_node = build_vector_type (intSI_type_node, 4); V4SF_type_node = build_vector_type (float_type_node, 4); @@ -9102,6 +10862,7 @@ rs6000_init_builtins (void) unsigned_V16QI_type_node = build_vector_type (unsigned_intQI_type_node, 16); unsigned_V8HI_type_node = build_vector_type (unsigned_intHI_type_node, 8); unsigned_V4SI_type_node = build_vector_type (unsigned_intSI_type_node, 4); + unsigned_V2DI_type_node = build_vector_type (unsigned_intDI_type_node, 2); opaque_V2SF_type_node = build_opaque_vector_type (float_type_node, 2); opaque_V2SI_type_node = build_opaque_vector_type (intSI_type_node, 2); @@ -9115,6 +10876,7 @@ rs6000_init_builtins (void) bool_char_type_node = build_distinct_type_copy (unsigned_intQI_type_node); bool_short_type_node = build_distinct_type_copy (unsigned_intHI_type_node); bool_int_type_node = build_distinct_type_copy (unsigned_intSI_type_node); + bool_long_type_node = build_distinct_type_copy (unsigned_intDI_type_node); pixel_type_node = build_distinct_type_copy (unsigned_intHI_type_node); long_integer_type_internal_node = long_integer_type_node; @@ -9125,9 +10887,36 @@ rs6000_init_builtins (void) uintHI_type_internal_node = unsigned_intHI_type_node; intSI_type_internal_node = intSI_type_node; uintSI_type_internal_node = unsigned_intSI_type_node; + intDI_type_internal_node = intDI_type_node; + uintDI_type_internal_node = unsigned_intDI_type_node; float_type_internal_node = float_type_node; + double_type_internal_node = float_type_node; void_type_internal_node = void_type_node; + /* Initialize the modes for builtin_function_type, mapping a machine 
mode to + tree type node. */ + builtin_mode_to_type[QImode][0] = integer_type_node; + builtin_mode_to_type[HImode][0] = integer_type_node; + builtin_mode_to_type[SImode][0] = intSI_type_node; + builtin_mode_to_type[SImode][1] = unsigned_intSI_type_node; + builtin_mode_to_type[DImode][0] = intDI_type_node; + builtin_mode_to_type[DImode][1] = unsigned_intDI_type_node; + builtin_mode_to_type[SFmode][0] = float_type_node; + builtin_mode_to_type[DFmode][0] = double_type_node; + builtin_mode_to_type[V2SImode][0] = V2SI_type_node; + builtin_mode_to_type[V2SFmode][0] = V2SF_type_node; + builtin_mode_to_type[V2DImode][0] = V2DI_type_node; + builtin_mode_to_type[V2DImode][1] = unsigned_V2DI_type_node; + builtin_mode_to_type[V2DFmode][0] = V2DF_type_node; + builtin_mode_to_type[V4HImode][0] = V4HI_type_node; + builtin_mode_to_type[V4SImode][0] = V4SI_type_node; + builtin_mode_to_type[V4SImode][1] = unsigned_V4SI_type_node; + builtin_mode_to_type[V4SFmode][0] = V4SF_type_node; + builtin_mode_to_type[V8HImode][0] = V8HI_type_node; + builtin_mode_to_type[V8HImode][1] = unsigned_V8HI_type_node; + builtin_mode_to_type[V16QImode][0] = V16QI_type_node; + builtin_mode_to_type[V16QImode][1] = unsigned_V16QI_type_node; + (*lang_hooks.decls.pushdecl) (build_decl (TYPE_DECL, get_identifier ("__bool char"), bool_char_type_node)); @@ -9144,6 +10933,7 @@ rs6000_init_builtins (void) bool_V16QI_type_node = build_vector_type (bool_char_type_node, 16); bool_V8HI_type_node = build_vector_type (bool_short_type_node, 8); bool_V4SI_type_node = build_vector_type (bool_int_type_node, 4); + bool_V2DI_type_node = build_vector_type (bool_long_type_node, 2); pixel_V8HI_type_node = build_vector_type (pixel_type_node, 8); (*lang_hooks.decls.pushdecl) (build_decl (TYPE_DECL, @@ -9183,39 +10973,68 @@ rs6000_init_builtins (void) get_identifier ("__vector __pixel"), pixel_V8HI_type_node)); + if (TARGET_VSX) + { + (*lang_hooks.decls.pushdecl) (build_decl (TYPE_DECL, + get_identifier ("__vector double"), + 
V2DF_type_node)); + (*lang_hooks.decls.pushdecl) (build_decl (TYPE_DECL, + get_identifier ("__vector long"), + V2DI_type_node)); + (*lang_hooks.decls.pushdecl) (build_decl (TYPE_DECL, + get_identifier ("__vector __bool long"), + bool_V2DI_type_node)); + } + if (TARGET_PAIRED_FLOAT) paired_init_builtins (); if (TARGET_SPE) spe_init_builtins (); if (TARGET_ALTIVEC) altivec_init_builtins (); - if (TARGET_ALTIVEC || TARGET_SPE || TARGET_PAIRED_FLOAT) + if (TARGET_ALTIVEC || TARGET_SPE || TARGET_PAIRED_FLOAT || TARGET_VSX) rs6000_common_init_builtins (); if (TARGET_PPC_GFXOPT) { - tree ftype = build_function_type_list (float_type_node, - float_type_node, - float_type_node, - NULL_TREE); + tree ftype = builtin_function_type (SFmode, SFmode, SFmode, VOIDmode, + RS6000_BUILTIN_RECIPF, + "__builtin_recipdivf"); def_builtin (MASK_PPC_GFXOPT, "__builtin_recipdivf", ftype, RS6000_BUILTIN_RECIPF); - ftype = build_function_type_list (float_type_node, - float_type_node, - NULL_TREE); + ftype = builtin_function_type (SFmode, SFmode, VOIDmode, VOIDmode, + RS6000_BUILTIN_RSQRTF, + "__builtin_rsqrtf"); def_builtin (MASK_PPC_GFXOPT, "__builtin_rsqrtf", ftype, RS6000_BUILTIN_RSQRTF); } if (TARGET_POPCNTB) { - tree ftype = build_function_type_list (double_type_node, - double_type_node, - double_type_node, - NULL_TREE); + tree ftype = builtin_function_type (DFmode, DFmode, DFmode, VOIDmode, + RS6000_BUILTIN_RECIP, + "__builtin_recipdiv"); def_builtin (MASK_POPCNTB, "__builtin_recipdiv", ftype, RS6000_BUILTIN_RECIP); } + if (TARGET_POPCNTD) + { + enum machine_mode mode = (TARGET_64BIT) ? DImode : SImode; + tree ftype = builtin_function_type (mode, mode, mode, VOIDmode, + POWER7_BUILTIN_BPERMD, + "__builtin_bpermd"); + def_builtin (MASK_POPCNTD, "__builtin_bpermd", ftype, + POWER7_BUILTIN_BPERMD); + } + if (TARGET_POWERPC) + { + /* Don't use builtin_function_type here, as it maps HI/QI to SI. 
*/ + tree ftype = build_function_type_list (unsigned_intHI_type_node, + unsigned_intHI_type_node, + NULL_TREE); + def_builtin (MASK_POWERPC, "__builtin_bswap16", ftype, + RS6000_BUILTIN_BSWAP_HI); + } #if TARGET_XCOFF /* AIX libm provides clog as __clog. */ @@ -9644,6 +11463,10 @@ altivec_init_builtins (void) = build_function_type_list (integer_type_node, integer_type_node, V4SF_type_node, V4SF_type_node, NULL_TREE); + tree int_ftype_int_v2df_v2df + = build_function_type_list (integer_type_node, + integer_type_node, V2DF_type_node, + V2DF_type_node, NULL_TREE); tree v4si_ftype_v4si = build_function_type_list (V4SI_type_node, V4SI_type_node, NULL_TREE); tree v8hi_ftype_v8hi @@ -9652,6 +11475,8 @@ altivec_init_builtins (void) = build_function_type_list (V16QI_type_node, V16QI_type_node, NULL_TREE); tree v4sf_ftype_v4sf = build_function_type_list (V4SF_type_node, V4SF_type_node, NULL_TREE); + tree v2df_ftype_v2df + = build_function_type_list (V2DF_type_node, V2DF_type_node, NULL_TREE); tree void_ftype_pcvoid_int_int = build_function_type_list (void_type_node, pcvoid_type_node, integer_type_node, @@ -9754,8 +11579,10 @@ altivec_init_builtins (void) { enum machine_mode mode1; tree type; - bool is_overloaded = dp->code >= ALTIVEC_BUILTIN_OVERLOADED_FIRST - && dp->code <= ALTIVEC_BUILTIN_OVERLOADED_LAST; + bool is_overloaded = ((dp->code >= ALTIVEC_BUILTIN_OVERLOADED_FIRST + && dp->code <= ALTIVEC_BUILTIN_OVERLOADED_LAST) + || (dp->code >= VSX_BUILTIN_OVERLOADED_FIRST + && dp->code <= VSX_BUILTIN_OVERLOADED_LAST)); if (is_overloaded) mode1 = VOIDmode; @@ -9779,6 +11606,9 @@ altivec_init_builtins (void) case V4SFmode: type = int_ftype_int_v4sf_v4sf; break; + case V2DFmode: + type = int_ftype_int_v2df_v2df; + break; default: gcc_unreachable (); } @@ -9809,6 +11639,9 @@ altivec_init_builtins (void) case V4SFmode: type = v4sf_ftype_v4sf; break; + case V2DFmode: + type = v2df_ftype_v2df; + break; default: gcc_unreachable (); } @@ -9868,6 +11701,19 @@ altivec_init_builtins 
(void) def_builtin (MASK_ALTIVEC, "__builtin_vec_init_v4sf", ftype, ALTIVEC_BUILTIN_VEC_INIT_V4SF); + if (TARGET_VSX) + { + ftype = build_function_type_list (V2DF_type_node, double_type_node, + double_type_node, NULL_TREE); + def_builtin (MASK_VSX, "__builtin_vec_init_v2df", ftype, + VSX_BUILTIN_VEC_INIT_V2DF); + + ftype = build_function_type_list (V2DI_type_node, intDI_type_node, + intDI_type_node, NULL_TREE); + def_builtin (MASK_VSX, "__builtin_vec_init_v2di", ftype, + VSX_BUILTIN_VEC_INIT_V2DI); + } + /* Access to the vec_set patterns. */ ftype = build_function_type_list (V4SI_type_node, V4SI_type_node, intSI_type_node, @@ -9881,7 +11727,7 @@ altivec_init_builtins (void) def_builtin (MASK_ALTIVEC, "__builtin_vec_set_v8hi", ftype, ALTIVEC_BUILTIN_VEC_SET_V8HI); - ftype = build_function_type_list (V8HI_type_node, V16QI_type_node, + ftype = build_function_type_list (V16QI_type_node, V16QI_type_node, intQI_type_node, integer_type_node, NULL_TREE); def_builtin (MASK_ALTIVEC, "__builtin_vec_set_v16qi", ftype, @@ -9890,9 +11736,24 @@ altivec_init_builtins (void) ftype = build_function_type_list (V4SF_type_node, V4SF_type_node, float_type_node, integer_type_node, NULL_TREE); - def_builtin (MASK_ALTIVEC, "__builtin_vec_set_v4sf", ftype, + def_builtin (MASK_ALTIVEC|MASK_VSX, "__builtin_vec_set_v4sf", ftype, ALTIVEC_BUILTIN_VEC_SET_V4SF); + if (TARGET_VSX) + { + ftype = build_function_type_list (V2DF_type_node, V2DF_type_node, + double_type_node, + integer_type_node, NULL_TREE); + def_builtin (MASK_VSX, "__builtin_vec_set_v2df", ftype, + VSX_BUILTIN_VEC_SET_V2DF); + + ftype = build_function_type_list (V2DI_type_node, V2DI_type_node, + intDI_type_node, + integer_type_node, NULL_TREE); + def_builtin (MASK_VSX, "__builtin_vec_set_v2di", ftype, + VSX_BUILTIN_VEC_SET_V2DI); + } + /* Access to the vec_extract patterns. 
*/ ftype = build_function_type_list (intSI_type_node, V4SI_type_node, integer_type_node, NULL_TREE); @@ -9911,539 +11772,376 @@ altivec_init_builtins (void) ftype = build_function_type_list (float_type_node, V4SF_type_node, integer_type_node, NULL_TREE); - def_builtin (MASK_ALTIVEC, "__builtin_vec_ext_v4sf", ftype, + def_builtin (MASK_ALTIVEC|MASK_VSX, "__builtin_vec_ext_v4sf", ftype, ALTIVEC_BUILTIN_VEC_EXT_V4SF); + + if (TARGET_VSX) + { + ftype = build_function_type_list (double_type_node, V2DF_type_node, + integer_type_node, NULL_TREE); + def_builtin (MASK_VSX, "__builtin_vec_ext_v2df", ftype, + VSX_BUILTIN_VEC_EXT_V2DF); + + ftype = build_function_type_list (intDI_type_node, V2DI_type_node, + integer_type_node, NULL_TREE); + def_builtin (MASK_VSX, "__builtin_vec_ext_v2di", ftype, + VSX_BUILTIN_VEC_EXT_V2DI); + } } -static void -rs6000_common_init_builtins (void) +/* Hash function for builtin functions with up to 3 arguments and a return + type. */ +static unsigned +builtin_hash_function (const void *hash_entry) { - const struct builtin_description *d; - size_t i; + unsigned ret = 0; + int i; + const struct builtin_hash_struct *bh = + (const struct builtin_hash_struct *) hash_entry; - tree v2sf_ftype_v2sf_v2sf_v2sf - = build_function_type_list (V2SF_type_node, - V2SF_type_node, V2SF_type_node, - V2SF_type_node, NULL_TREE); - - tree v4sf_ftype_v4sf_v4sf_v16qi - = build_function_type_list (V4SF_type_node, - V4SF_type_node, V4SF_type_node, - V16QI_type_node, NULL_TREE); - tree v4si_ftype_v4si_v4si_v16qi - = build_function_type_list (V4SI_type_node, - V4SI_type_node, V4SI_type_node, - V16QI_type_node, NULL_TREE); - tree v8hi_ftype_v8hi_v8hi_v16qi - = build_function_type_list (V8HI_type_node, - V8HI_type_node, V8HI_type_node, - V16QI_type_node, NULL_TREE); - tree v16qi_ftype_v16qi_v16qi_v16qi - = build_function_type_list (V16QI_type_node, - V16QI_type_node, V16QI_type_node, - V16QI_type_node, NULL_TREE); - tree v4si_ftype_int - = build_function_type_list 
(V4SI_type_node, integer_type_node, NULL_TREE); - tree v8hi_ftype_int - = build_function_type_list (V8HI_type_node, integer_type_node, NULL_TREE); - tree v16qi_ftype_int - = build_function_type_list (V16QI_type_node, integer_type_node, NULL_TREE); - tree v8hi_ftype_v16qi - = build_function_type_list (V8HI_type_node, V16QI_type_node, NULL_TREE); - tree v4sf_ftype_v4sf - = build_function_type_list (V4SF_type_node, V4SF_type_node, NULL_TREE); + for (i = 0; i < 4; i++) + { + ret = (ret * (unsigned)MAX_MACHINE_MODE) + ((unsigned)bh->mode[i]); + ret = (ret * 2) + bh->uns_p[i]; + } - tree v2si_ftype_v2si_v2si - = build_function_type_list (opaque_V2SI_type_node, - opaque_V2SI_type_node, - opaque_V2SI_type_node, NULL_TREE); - - tree v2sf_ftype_v2sf_v2sf_spe - = build_function_type_list (opaque_V2SF_type_node, - opaque_V2SF_type_node, - opaque_V2SF_type_node, NULL_TREE); - - tree v2sf_ftype_v2sf_v2sf - = build_function_type_list (V2SF_type_node, - V2SF_type_node, - V2SF_type_node, NULL_TREE); - - - tree v2si_ftype_int_int - = build_function_type_list (opaque_V2SI_type_node, - integer_type_node, integer_type_node, - NULL_TREE); + return ret; +} - tree opaque_ftype_opaque - = build_function_type_list (opaque_V4SI_type_node, - opaque_V4SI_type_node, NULL_TREE); +/* Compare builtin hash entries H1 and H2 for equivalence. */ +static int +builtin_hash_eq (const void *h1, const void *h2) +{ + const struct builtin_hash_struct *p1 = (const struct builtin_hash_struct *) h1; + const struct builtin_hash_struct *p2 = (const struct builtin_hash_struct *) h2; + + return ((p1->mode[0] == p2->mode[0]) + && (p1->mode[1] == p2->mode[1]) + && (p1->mode[2] == p2->mode[2]) + && (p1->mode[3] == p2->mode[3]) + && (p1->uns_p[0] == p2->uns_p[0]) + && (p1->uns_p[1] == p2->uns_p[1]) + && (p1->uns_p[2] == p2->uns_p[2]) + && (p1->uns_p[3] == p2->uns_p[3])); +} + +/* Map types for builtin functions with an explicit return type and up to 3 + arguments. 
Functions with fewer than 3 arguments use VOIDmode as the type + of the argument. */ +static tree +builtin_function_type (enum machine_mode mode_ret, enum machine_mode mode_arg0, + enum machine_mode mode_arg1, enum machine_mode mode_arg2, + enum rs6000_builtins builtin, const char *name) +{ + struct builtin_hash_struct h; + struct builtin_hash_struct *h2; + void **found; + int num_args = 3; + int i; + tree ret_type = NULL_TREE; + tree arg_type[3] = { NULL_TREE, NULL_TREE, NULL_TREE }; + tree args; + + /* Create builtin_hash_table. */ + if (builtin_hash_table == NULL) + builtin_hash_table = htab_create_ggc (1500, builtin_hash_function, + builtin_hash_eq, NULL); + + h.type = NULL_TREE; + h.mode[0] = mode_ret; + h.mode[1] = mode_arg0; + h.mode[2] = mode_arg1; + h.mode[3] = mode_arg2; + h.uns_p[0] = 0; + h.uns_p[1] = 0; + h.uns_p[2] = 0; + h.uns_p[3] = 0; + + /* If the builtin is a type that produces unsigned results or takes unsigned + arguments, and it is returned as a decl for the vectorizer (such as + widening multiplies, permute), make sure the arguments and return value + are type correct. */ + switch (builtin) + { + /* unsigned 2 argument functions. */ + case ALTIVEC_BUILTIN_VMULEUB_UNS: + case ALTIVEC_BUILTIN_VMULEUH_UNS: + case ALTIVEC_BUILTIN_VMULOUB_UNS: + case ALTIVEC_BUILTIN_VMULOUH_UNS: + h.uns_p[0] = 1; + h.uns_p[1] = 1; + h.uns_p[2] = 1; + break; + + /* unsigned 3 argument functions. 
*/ + case ALTIVEC_BUILTIN_VPERM_16QI_UNS: + case ALTIVEC_BUILTIN_VPERM_8HI_UNS: + case ALTIVEC_BUILTIN_VPERM_4SI_UNS: + case ALTIVEC_BUILTIN_VPERM_2DI_UNS: + case ALTIVEC_BUILTIN_VSEL_16QI_UNS: + case ALTIVEC_BUILTIN_VSEL_8HI_UNS: + case ALTIVEC_BUILTIN_VSEL_4SI_UNS: + case ALTIVEC_BUILTIN_VSEL_2DI_UNS: + case VSX_BUILTIN_VPERM_16QI_UNS: + case VSX_BUILTIN_VPERM_8HI_UNS: + case VSX_BUILTIN_VPERM_4SI_UNS: + case VSX_BUILTIN_VPERM_2DI_UNS: + case VSX_BUILTIN_XXSEL_16QI_UNS: + case VSX_BUILTIN_XXSEL_8HI_UNS: + case VSX_BUILTIN_XXSEL_4SI_UNS: + case VSX_BUILTIN_XXSEL_2DI_UNS: + h.uns_p[0] = 1; + h.uns_p[1] = 1; + h.uns_p[2] = 1; + h.uns_p[3] = 1; + break; + + /* signed permute functions with unsigned char mask. */ + case ALTIVEC_BUILTIN_VPERM_16QI: + case ALTIVEC_BUILTIN_VPERM_8HI: + case ALTIVEC_BUILTIN_VPERM_4SI: + case ALTIVEC_BUILTIN_VPERM_4SF: + case ALTIVEC_BUILTIN_VPERM_2DI: + case ALTIVEC_BUILTIN_VPERM_2DF: + case VSX_BUILTIN_VPERM_16QI: + case VSX_BUILTIN_VPERM_8HI: + case VSX_BUILTIN_VPERM_4SI: + case VSX_BUILTIN_VPERM_4SF: + case VSX_BUILTIN_VPERM_2DI: + case VSX_BUILTIN_VPERM_2DF: + h.uns_p[3] = 1; + break; + + /* unsigned args, signed return. */ + case VSX_BUILTIN_XVCVUXDDP_UNS: + case VECTOR_BUILTIN_UNSFLOAT_V4SI_V4SF: + h.uns_p[1] = 1; + break; + + /* signed args, unsigned return. */ + case VSX_BUILTIN_XVCVDPUXDS_UNS: + case VECTOR_BUILTIN_FIXUNS_V4SF_V4SI: + h.uns_p[0] = 1; + break; + + default: + break; + } + + /* Figure out how many args are present. 
*/ + while (num_args > 0 && h.mode[num_args] == VOIDmode) + num_args--; + + if (num_args == 0) + fatal_error ("internal error: builtin function %s had no type", name); + + ret_type = builtin_mode_to_type[h.mode[0]][h.uns_p[0]]; + if (!ret_type && h.uns_p[0]) + ret_type = builtin_mode_to_type[h.mode[0]][0]; + + if (!ret_type) + fatal_error ("internal error: builtin function %s had an unexpected " + "return type %s", name, GET_MODE_NAME (h.mode[0])); + + for (i = 0; i < num_args; i++) + { + int m = (int) h.mode[i+1]; + int uns_p = h.uns_p[i+1]; + + arg_type[i] = builtin_mode_to_type[m][uns_p]; + if (!arg_type[i] && uns_p) + arg_type[i] = builtin_mode_to_type[m][0]; + + if (!arg_type[i]) + fatal_error ("internal error: builtin function %s, argument %d " + "had unexpected argument type %s", name, i, + GET_MODE_NAME (m)); + } + + found = htab_find_slot (builtin_hash_table, &h, INSERT); + if (*found == NULL) + { + h2 = GGC_NEW (struct builtin_hash_struct); + *h2 = h; + *found = (void *)h2; + args = void_list_node; + + for (i = num_args - 1; i >= 0; i--) + args = tree_cons (NULL_TREE, arg_type[i], args); + + h2->type = build_function_type (ret_type, args); + } + + return ((struct builtin_hash_struct *)(*found))->type; +} - tree v2si_ftype_v2si - = build_function_type_list (opaque_V2SI_type_node, - opaque_V2SI_type_node, NULL_TREE); - - tree v2sf_ftype_v2sf_spe - = build_function_type_list (opaque_V2SF_type_node, - opaque_V2SF_type_node, NULL_TREE); - - tree v2sf_ftype_v2sf - = build_function_type_list (V2SF_type_node, - V2SF_type_node, NULL_TREE); - - tree v2sf_ftype_v2si - = build_function_type_list (opaque_V2SF_type_node, - opaque_V2SI_type_node, NULL_TREE); - - tree v2si_ftype_v2sf - = build_function_type_list (opaque_V2SI_type_node, - opaque_V2SF_type_node, NULL_TREE); - - tree v2si_ftype_v2si_char - = build_function_type_list (opaque_V2SI_type_node, - opaque_V2SI_type_node, - char_type_node, NULL_TREE); - - tree v2si_ftype_int_char - = build_function_type_list 
(opaque_V2SI_type_node, - integer_type_node, char_type_node, NULL_TREE); - - tree v2si_ftype_char - = build_function_type_list (opaque_V2SI_type_node, - char_type_node, NULL_TREE); +static void +rs6000_common_init_builtins (void) +{ + const struct builtin_description *d; + size_t i; - tree int_ftype_int_int - = build_function_type_list (integer_type_node, - integer_type_node, integer_type_node, - NULL_TREE); + tree opaque_ftype_opaque = NULL_TREE; + tree opaque_ftype_opaque_opaque = NULL_TREE; + tree opaque_ftype_opaque_opaque_opaque = NULL_TREE; + tree v2si_ftype_qi = NULL_TREE; + tree v2si_ftype_v2si_qi = NULL_TREE; + tree v2si_ftype_int_qi = NULL_TREE; - tree opaque_ftype_opaque_opaque - = build_function_type_list (opaque_V4SI_type_node, - opaque_V4SI_type_node, opaque_V4SI_type_node, NULL_TREE); - tree v4si_ftype_v4si_v4si - = build_function_type_list (V4SI_type_node, - V4SI_type_node, V4SI_type_node, NULL_TREE); - tree v4sf_ftype_v4si_int - = build_function_type_list (V4SF_type_node, - V4SI_type_node, integer_type_node, NULL_TREE); - tree v4si_ftype_v4sf_int - = build_function_type_list (V4SI_type_node, - V4SF_type_node, integer_type_node, NULL_TREE); - tree v4si_ftype_v4si_int - = build_function_type_list (V4SI_type_node, - V4SI_type_node, integer_type_node, NULL_TREE); - tree v8hi_ftype_v8hi_int - = build_function_type_list (V8HI_type_node, - V8HI_type_node, integer_type_node, NULL_TREE); - tree v16qi_ftype_v16qi_int - = build_function_type_list (V16QI_type_node, - V16QI_type_node, integer_type_node, NULL_TREE); - tree v16qi_ftype_v16qi_v16qi_int - = build_function_type_list (V16QI_type_node, - V16QI_type_node, V16QI_type_node, - integer_type_node, NULL_TREE); - tree v8hi_ftype_v8hi_v8hi_int - = build_function_type_list (V8HI_type_node, - V8HI_type_node, V8HI_type_node, - integer_type_node, NULL_TREE); - tree v4si_ftype_v4si_v4si_int - = build_function_type_list (V4SI_type_node, - V4SI_type_node, V4SI_type_node, - integer_type_node, NULL_TREE); - tree 
v4sf_ftype_v4sf_v4sf_int - = build_function_type_list (V4SF_type_node, - V4SF_type_node, V4SF_type_node, - integer_type_node, NULL_TREE); - tree v4sf_ftype_v4sf_v4sf - = build_function_type_list (V4SF_type_node, - V4SF_type_node, V4SF_type_node, NULL_TREE); - tree opaque_ftype_opaque_opaque_opaque - = build_function_type_list (opaque_V4SI_type_node, - opaque_V4SI_type_node, opaque_V4SI_type_node, - opaque_V4SI_type_node, NULL_TREE); - tree v4sf_ftype_v4sf_v4sf_v4si - = build_function_type_list (V4SF_type_node, - V4SF_type_node, V4SF_type_node, - V4SI_type_node, NULL_TREE); - tree v4sf_ftype_v4sf_v4sf_v4sf - = build_function_type_list (V4SF_type_node, - V4SF_type_node, V4SF_type_node, - V4SF_type_node, NULL_TREE); - tree v4si_ftype_v4si_v4si_v4si - = build_function_type_list (V4SI_type_node, - V4SI_type_node, V4SI_type_node, - V4SI_type_node, NULL_TREE); - tree v8hi_ftype_v8hi_v8hi - = build_function_type_list (V8HI_type_node, - V8HI_type_node, V8HI_type_node, NULL_TREE); - tree v8hi_ftype_v8hi_v8hi_v8hi - = build_function_type_list (V8HI_type_node, - V8HI_type_node, V8HI_type_node, - V8HI_type_node, NULL_TREE); - tree v4si_ftype_v8hi_v8hi_v4si - = build_function_type_list (V4SI_type_node, - V8HI_type_node, V8HI_type_node, - V4SI_type_node, NULL_TREE); - tree v4si_ftype_v16qi_v16qi_v4si - = build_function_type_list (V4SI_type_node, - V16QI_type_node, V16QI_type_node, - V4SI_type_node, NULL_TREE); - tree v16qi_ftype_v16qi_v16qi - = build_function_type_list (V16QI_type_node, - V16QI_type_node, V16QI_type_node, NULL_TREE); - tree v4si_ftype_v4sf_v4sf - = build_function_type_list (V4SI_type_node, - V4SF_type_node, V4SF_type_node, NULL_TREE); - tree v8hi_ftype_v16qi_v16qi - = build_function_type_list (V8HI_type_node, - V16QI_type_node, V16QI_type_node, NULL_TREE); - tree v4si_ftype_v8hi_v8hi - = build_function_type_list (V4SI_type_node, - V8HI_type_node, V8HI_type_node, NULL_TREE); - tree v8hi_ftype_v4si_v4si - = build_function_type_list (V8HI_type_node, - 
V4SI_type_node, V4SI_type_node, NULL_TREE); - tree v16qi_ftype_v8hi_v8hi - = build_function_type_list (V16QI_type_node, - V8HI_type_node, V8HI_type_node, NULL_TREE); - tree v4si_ftype_v16qi_v4si - = build_function_type_list (V4SI_type_node, - V16QI_type_node, V4SI_type_node, NULL_TREE); - tree v4si_ftype_v16qi_v16qi - = build_function_type_list (V4SI_type_node, - V16QI_type_node, V16QI_type_node, NULL_TREE); - tree v4si_ftype_v8hi_v4si - = build_function_type_list (V4SI_type_node, - V8HI_type_node, V4SI_type_node, NULL_TREE); - tree v4si_ftype_v8hi - = build_function_type_list (V4SI_type_node, V8HI_type_node, NULL_TREE); - tree int_ftype_v4si_v4si - = build_function_type_list (integer_type_node, - V4SI_type_node, V4SI_type_node, NULL_TREE); - tree int_ftype_v4sf_v4sf - = build_function_type_list (integer_type_node, - V4SF_type_node, V4SF_type_node, NULL_TREE); - tree int_ftype_v16qi_v16qi - = build_function_type_list (integer_type_node, - V16QI_type_node, V16QI_type_node, NULL_TREE); - tree int_ftype_v8hi_v8hi - = build_function_type_list (integer_type_node, - V8HI_type_node, V8HI_type_node, NULL_TREE); + if (!TARGET_PAIRED_FLOAT) + { + builtin_mode_to_type[V2SImode][0] = opaque_V2SI_type_node; + builtin_mode_to_type[V2SFmode][0] = opaque_V2SF_type_node; + } - /* Add the simple ternary operators. */ + /* Add the ternary operators. 
*/ d = bdesc_3arg; for (i = 0; i < ARRAY_SIZE (bdesc_3arg); i++, d++) { - enum machine_mode mode0, mode1, mode2, mode3; tree type; - bool is_overloaded = d->code >= ALTIVEC_BUILTIN_OVERLOADED_FIRST - && d->code <= ALTIVEC_BUILTIN_OVERLOADED_LAST; + int mask = d->mask; - if (is_overloaded) - { - mode0 = VOIDmode; - mode1 = VOIDmode; - mode2 = VOIDmode; - mode3 = VOIDmode; + if ((mask != 0 && (mask & target_flags) == 0) + || (mask == 0 && !TARGET_PAIRED_FLOAT)) + continue; + + if ((d->code >= ALTIVEC_BUILTIN_OVERLOADED_FIRST + && d->code <= ALTIVEC_BUILTIN_OVERLOADED_LAST) + || (d->code >= VSX_BUILTIN_OVERLOADED_FIRST + && d->code <= VSX_BUILTIN_OVERLOADED_LAST)) + { + if (! (type = opaque_ftype_opaque_opaque_opaque)) + type = opaque_ftype_opaque_opaque_opaque + = build_function_type_list (opaque_V4SI_type_node, + opaque_V4SI_type_node, + opaque_V4SI_type_node, + opaque_V4SI_type_node, + NULL_TREE); } else { - if (d->name == 0 || d->icode == CODE_FOR_nothing) + enum insn_code icode = d->icode; + if (d->name == 0 || icode == CODE_FOR_nothing) continue; - mode0 = insn_data[d->icode].operand[0].mode; - mode1 = insn_data[d->icode].operand[1].mode; - mode2 = insn_data[d->icode].operand[2].mode; - mode3 = insn_data[d->icode].operand[3].mode; - } - - /* When all four are of the same mode. 
*/ - if (mode0 == mode1 && mode1 == mode2 && mode2 == mode3) - { - switch (mode0) - { - case VOIDmode: - type = opaque_ftype_opaque_opaque_opaque; - break; - case V4SImode: - type = v4si_ftype_v4si_v4si_v4si; - break; - case V4SFmode: - type = v4sf_ftype_v4sf_v4sf_v4sf; - break; - case V8HImode: - type = v8hi_ftype_v8hi_v8hi_v8hi; - break; - case V16QImode: - type = v16qi_ftype_v16qi_v16qi_v16qi; - break; - case V2SFmode: - type = v2sf_ftype_v2sf_v2sf_v2sf; - break; - default: - gcc_unreachable (); - } - } - else if (mode0 == mode1 && mode1 == mode2 && mode3 == V16QImode) - { - switch (mode0) - { - case V4SImode: - type = v4si_ftype_v4si_v4si_v16qi; - break; - case V4SFmode: - type = v4sf_ftype_v4sf_v4sf_v16qi; - break; - case V8HImode: - type = v8hi_ftype_v8hi_v8hi_v16qi; - break; - case V16QImode: - type = v16qi_ftype_v16qi_v16qi_v16qi; - break; - default: - gcc_unreachable (); - } + type = builtin_function_type (insn_data[icode].operand[0].mode, + insn_data[icode].operand[1].mode, + insn_data[icode].operand[2].mode, + insn_data[icode].operand[3].mode, + d->code, d->name); } - else if (mode0 == V4SImode && mode1 == V16QImode && mode2 == V16QImode - && mode3 == V4SImode) - type = v4si_ftype_v16qi_v16qi_v4si; - else if (mode0 == V4SImode && mode1 == V8HImode && mode2 == V8HImode - && mode3 == V4SImode) - type = v4si_ftype_v8hi_v8hi_v4si; - else if (mode0 == V4SFmode && mode1 == V4SFmode && mode2 == V4SFmode - && mode3 == V4SImode) - type = v4sf_ftype_v4sf_v4sf_v4si; - - /* vchar, vchar, vchar, 4-bit literal. */ - else if (mode0 == V16QImode && mode1 == mode0 && mode2 == mode0 - && mode3 == QImode) - type = v16qi_ftype_v16qi_v16qi_int; - - /* vshort, vshort, vshort, 4-bit literal. */ - else if (mode0 == V8HImode && mode1 == mode0 && mode2 == mode0 - && mode3 == QImode) - type = v8hi_ftype_v8hi_v8hi_int; - - /* vint, vint, vint, 4-bit literal. 
*/ - else if (mode0 == V4SImode && mode1 == mode0 && mode2 == mode0 - && mode3 == QImode) - type = v4si_ftype_v4si_v4si_int; - - /* vfloat, vfloat, vfloat, 4-bit literal. */ - else if (mode0 == V4SFmode && mode1 == mode0 && mode2 == mode0 - && mode3 == QImode) - type = v4sf_ftype_v4sf_v4sf_int; - - else - gcc_unreachable (); def_builtin (d->mask, d->name, type, d->code); } - /* Add the simple binary operators. */ + /* Add the binary operators. */ d = (struct builtin_description *) bdesc_2arg; for (i = 0; i < ARRAY_SIZE (bdesc_2arg); i++, d++) { enum machine_mode mode0, mode1, mode2; tree type; - bool is_overloaded = d->code >= ALTIVEC_BUILTIN_OVERLOADED_FIRST - && d->code <= ALTIVEC_BUILTIN_OVERLOADED_LAST; + int mask = d->mask; - if (is_overloaded) - { - mode0 = VOIDmode; - mode1 = VOIDmode; - mode2 = VOIDmode; + if ((mask != 0 && (mask & target_flags) == 0) + || (mask == 0 && !TARGET_PAIRED_FLOAT)) + continue; + + if ((d->code >= ALTIVEC_BUILTIN_OVERLOADED_FIRST + && d->code <= ALTIVEC_BUILTIN_OVERLOADED_LAST) + || (d->code >= VSX_BUILTIN_OVERLOADED_FIRST + && d->code <= VSX_BUILTIN_OVERLOADED_LAST)) + { + if (! (type = opaque_ftype_opaque_opaque)) + type = opaque_ftype_opaque_opaque + = build_function_type_list (opaque_V4SI_type_node, + opaque_V4SI_type_node, + opaque_V4SI_type_node, + NULL_TREE); } else { - if (d->name == 0 || d->icode == CODE_FOR_nothing) + enum insn_code icode = d->icode; + if (d->name == 0 || icode == CODE_FOR_nothing) continue; - mode0 = insn_data[d->icode].operand[0].mode; - mode1 = insn_data[d->icode].operand[1].mode; - mode2 = insn_data[d->icode].operand[2].mode; - } + mode0 = insn_data[icode].operand[0].mode; + mode1 = insn_data[icode].operand[1].mode; + mode2 = insn_data[icode].operand[2].mode; - /* When all three operands are of the same mode. 
*/ - if (mode0 == mode1 && mode1 == mode2) - { - switch (mode0) + if (mode0 == V2SImode && mode1 == V2SImode && mode2 == QImode) { - case VOIDmode: - type = opaque_ftype_opaque_opaque; - break; - case V4SFmode: - type = v4sf_ftype_v4sf_v4sf; - break; - case V4SImode: - type = v4si_ftype_v4si_v4si; - break; - case V16QImode: - type = v16qi_ftype_v16qi_v16qi; - break; - case V8HImode: - type = v8hi_ftype_v8hi_v8hi; - break; - case V2SImode: - type = v2si_ftype_v2si_v2si; - break; - case V2SFmode: - if (TARGET_PAIRED_FLOAT) - type = v2sf_ftype_v2sf_v2sf; - else - type = v2sf_ftype_v2sf_v2sf_spe; - break; - case SImode: - type = int_ftype_int_int; - break; - default: - gcc_unreachable (); + if (! (type = v2si_ftype_v2si_qi)) + type = v2si_ftype_v2si_qi + = build_function_type_list (opaque_V2SI_type_node, + opaque_V2SI_type_node, + char_type_node, + NULL_TREE); } - } - - /* A few other combos we really don't want to do manually. */ - - /* vint, vfloat, vfloat. */ - else if (mode0 == V4SImode && mode1 == V4SFmode && mode2 == V4SFmode) - type = v4si_ftype_v4sf_v4sf; - - /* vshort, vchar, vchar. */ - else if (mode0 == V8HImode && mode1 == V16QImode && mode2 == V16QImode) - type = v8hi_ftype_v16qi_v16qi; - - /* vint, vshort, vshort. */ - else if (mode0 == V4SImode && mode1 == V8HImode && mode2 == V8HImode) - type = v4si_ftype_v8hi_v8hi; - - /* vshort, vint, vint. */ - else if (mode0 == V8HImode && mode1 == V4SImode && mode2 == V4SImode) - type = v8hi_ftype_v4si_v4si; - /* vchar, vshort, vshort. */ - else if (mode0 == V16QImode && mode1 == V8HImode && mode2 == V8HImode) - type = v16qi_ftype_v8hi_v8hi; - - /* vint, vchar, vint. */ - else if (mode0 == V4SImode && mode1 == V16QImode && mode2 == V4SImode) - type = v4si_ftype_v16qi_v4si; - - /* vint, vchar, vchar. */ - else if (mode0 == V4SImode && mode1 == V16QImode && mode2 == V16QImode) - type = v4si_ftype_v16qi_v16qi; - - /* vint, vshort, vint. 
*/ - else if (mode0 == V4SImode && mode1 == V8HImode && mode2 == V4SImode) - type = v4si_ftype_v8hi_v4si; - - /* vint, vint, 5-bit literal. */ - else if (mode0 == V4SImode && mode1 == V4SImode && mode2 == QImode) - type = v4si_ftype_v4si_int; - - /* vshort, vshort, 5-bit literal. */ - else if (mode0 == V8HImode && mode1 == V8HImode && mode2 == QImode) - type = v8hi_ftype_v8hi_int; - - /* vchar, vchar, 5-bit literal. */ - else if (mode0 == V16QImode && mode1 == V16QImode && mode2 == QImode) - type = v16qi_ftype_v16qi_int; - - /* vfloat, vint, 5-bit literal. */ - else if (mode0 == V4SFmode && mode1 == V4SImode && mode2 == QImode) - type = v4sf_ftype_v4si_int; - - /* vint, vfloat, 5-bit literal. */ - else if (mode0 == V4SImode && mode1 == V4SFmode && mode2 == QImode) - type = v4si_ftype_v4sf_int; - - else if (mode0 == V2SImode && mode1 == SImode && mode2 == SImode) - type = v2si_ftype_int_int; - - else if (mode0 == V2SImode && mode1 == V2SImode && mode2 == QImode) - type = v2si_ftype_v2si_char; - - else if (mode0 == V2SImode && mode1 == SImode && mode2 == QImode) - type = v2si_ftype_int_char; - - else - { - /* int, x, x. */ - gcc_assert (mode0 == SImode); - switch (mode1) + else if (mode0 == V2SImode && GET_MODE_CLASS (mode1) == MODE_INT + && mode2 == QImode) { - case V4SImode: - type = int_ftype_v4si_v4si; - break; - case V4SFmode: - type = int_ftype_v4sf_v4sf; - break; - case V16QImode: - type = int_ftype_v16qi_v16qi; - break; - case V8HImode: - type = int_ftype_v8hi_v8hi; - break; - default: - gcc_unreachable (); + if (! (type = v2si_ftype_int_qi)) + type = v2si_ftype_int_qi + = build_function_type_list (opaque_V2SI_type_node, + integer_type_node, + char_type_node, + NULL_TREE); } + + else + type = builtin_function_type (mode0, mode1, mode2, VOIDmode, + d->code, d->name); } def_builtin (d->mask, d->name, type, d->code); } - /* Add the simple unary operators. */ + /* Add the unary operators. 
*/ d = (struct builtin_description *) bdesc_1arg; for (i = 0; i < ARRAY_SIZE (bdesc_1arg); i++, d++) { enum machine_mode mode0, mode1; tree type; - bool is_overloaded = d->code >= ALTIVEC_BUILTIN_OVERLOADED_FIRST - && d->code <= ALTIVEC_BUILTIN_OVERLOADED_LAST; + int mask = d->mask; - if (is_overloaded) - { - mode0 = VOIDmode; - mode1 = VOIDmode; - } + if ((mask != 0 && (mask & target_flags) == 0) + || (mask == 0 && !TARGET_PAIRED_FLOAT)) + continue; + + if ((d->code >= ALTIVEC_BUILTIN_OVERLOADED_FIRST + && d->code <= ALTIVEC_BUILTIN_OVERLOADED_LAST) + || (d->code >= VSX_BUILTIN_OVERLOADED_FIRST + && d->code <= VSX_BUILTIN_OVERLOADED_LAST)) + { + if (! (type = opaque_ftype_opaque)) + type = opaque_ftype_opaque + = build_function_type_list (opaque_V4SI_type_node, + opaque_V4SI_type_node, + NULL_TREE); + } else { - if (d->name == 0 || d->icode == CODE_FOR_nothing) + enum insn_code icode = d->icode; + if (d->name == 0 || icode == CODE_FOR_nothing) continue; - mode0 = insn_data[d->icode].operand[0].mode; - mode1 = insn_data[d->icode].operand[1].mode; - } + mode0 = insn_data[icode].operand[0].mode; + mode1 = insn_data[icode].operand[1].mode; - if (mode0 == V4SImode && mode1 == QImode) - type = v4si_ftype_int; - else if (mode0 == V8HImode && mode1 == QImode) - type = v8hi_ftype_int; - else if (mode0 == V16QImode && mode1 == QImode) - type = v16qi_ftype_int; - else if (mode0 == VOIDmode && mode1 == VOIDmode) - type = opaque_ftype_opaque; - else if (mode0 == V4SFmode && mode1 == V4SFmode) - type = v4sf_ftype_v4sf; - else if (mode0 == V8HImode && mode1 == V16QImode) - type = v8hi_ftype_v16qi; - else if (mode0 == V4SImode && mode1 == V8HImode) - type = v4si_ftype_v8hi; - else if (mode0 == V2SImode && mode1 == V2SImode) - type = v2si_ftype_v2si; - else if (mode0 == V2SFmode && mode1 == V2SFmode) - { - if (TARGET_PAIRED_FLOAT) - type = v2sf_ftype_v2sf; - else - type = v2sf_ftype_v2sf_spe; - } - else if (mode0 == V2SFmode && mode1 == V2SImode) - type = v2sf_ftype_v2si; - else 
if (mode0 == V2SImode && mode1 == V2SFmode) - type = v2si_ftype_v2sf; - else if (mode0 == V2SImode && mode1 == QImode) - type = v2si_ftype_char; - else - gcc_unreachable (); + if (mode0 == V2SImode && mode1 == QImode) + { + if (! (type = v2si_ftype_qi)) + type = v2si_ftype_qi + = build_function_type_list (opaque_V2SI_type_node, + char_type_node, + NULL_TREE); + } + + else + type = builtin_function_type (mode0, mode1, VOIDmode, VOIDmode, + d->code, d->name); + } def_builtin (d->mask, d->name, type, d->code); } @@ -11201,77 +12899,510 @@ mems_ok_for_quad_peep (rtx mem1, rtx mem return 0; else { - reg2 = REGNO (addr2); - /* This was a simple (mem (reg)) expression. Offset is 0. */ - offset2 = 0; - } + reg2 = REGNO (addr2); + /* This was a simple (mem (reg)) expression. Offset is 0. */ + offset2 = 0; + } + + /* Both of these must have the same base register. */ + if (reg1 != reg2) + return 0; + + /* The offset for the second addr must be 8 more than the first addr. */ + if (offset2 != offset1 + 8) + return 0; + + /* All the tests passed. addr1 and addr2 are valid for lfq or stfq + instructions. */ + return 1; +} + + +rtx +rs6000_secondary_memory_needed_rtx (enum machine_mode mode) +{ + static bool eliminated = false; + rtx ret; + + if (mode != SDmode) + ret = assign_stack_local (mode, GET_MODE_SIZE (mode), 0); + else + { + rtx mem = cfun->machine->sdmode_stack_slot; + gcc_assert (mem != NULL_RTX); + + if (!eliminated) + { + mem = eliminate_regs (mem, VOIDmode, NULL_RTX); + cfun->machine->sdmode_stack_slot = mem; + eliminated = true; + } + ret = mem; + } + + if (TARGET_DEBUG_ADDR) + { + fprintf (stderr, "\nrs6000_secondary_memory_needed_rtx, mode %s, rtx:\n", + GET_MODE_NAME (mode)); + if (!ret) + fprintf (stderr, "\tNULL_RTX\n"); + else + debug_rtx (ret); + } + + return ret; +} + +static tree +rs6000_check_sdmode (tree *tp, int *walk_subtrees, void *data ATTRIBUTE_UNUSED) +{ + /* Don't walk into types. 
*/ + if (*tp == NULL_TREE || *tp == error_mark_node || TYPE_P (*tp)) + { + *walk_subtrees = 0; + return NULL_TREE; + } + + switch (TREE_CODE (*tp)) + { + case VAR_DECL: + case PARM_DECL: + case FIELD_DECL: + case RESULT_DECL: + case REAL_CST: + case INDIRECT_REF: + case ALIGN_INDIRECT_REF: + case MISALIGNED_INDIRECT_REF: + case VIEW_CONVERT_EXPR: + if (TYPE_MODE (TREE_TYPE (*tp)) == SDmode) + return *tp; + break; + default: + break; + } + + return NULL_TREE; +} + +enum reload_reg_type { + GPR_REGISTER_TYPE, + VECTOR_REGISTER_TYPE, + OTHER_REGISTER_TYPE +}; + +static enum reload_reg_type +rs6000_reload_register_type (enum reg_class rclass) +{ + switch (rclass) + { + case GENERAL_REGS: + case BASE_REGS: + return GPR_REGISTER_TYPE; + + case FLOAT_REGS: + case ALTIVEC_REGS: + case VSX_REGS: + return VECTOR_REGISTER_TYPE; + + default: + return OTHER_REGISTER_TYPE; + } +} + +/* Inform reload about cases where moving X with a mode MODE to a register in + RCLASS requires an extra scratch or immediate register. Return the class + needed for the immediate register. + + For VSX and Altivec, we may need a register to convert sp+offset into + reg+sp. */ + +static enum reg_class +rs6000_secondary_reload (bool in_p, + rtx x, + enum reg_class rclass, + enum machine_mode mode, + secondary_reload_info *sri) +{ + enum reg_class ret = ALL_REGS; + enum insn_code icode; + bool default_p = false; + + sri->icode = CODE_FOR_nothing; + + /* Convert vector loads and stores into gprs to use an additional base + register. */ + icode = rs6000_vector_reload[mode][in_p != false]; + if (icode != CODE_FOR_nothing) + { + ret = NO_REGS; + sri->icode = CODE_FOR_nothing; + sri->extra_cost = 0; + + if (GET_CODE (x) == MEM) + { + rtx addr = XEXP (x, 0); + + /* Loads to and stores from gprs can do reg+offset, and wouldn't need + an extra register in that case, but it would need an extra + register if the addressing is reg+reg or (reg+reg)&(-16). 
*/ + if (rclass == GENERAL_REGS || rclass == BASE_REGS) + { + if (!legitimate_indirect_address_p (addr, false) + && !rs6000_legitimate_offset_address_p (TImode, addr, false)) + { + sri->icode = icode; + /* account for splitting the loads, and converting the + address from reg+reg to reg. */ + sri->extra_cost = (((TARGET_64BIT) ? 3 : 5) + + ((GET_CODE (addr) == AND) ? 1 : 0)); + } + } + /* Loads to and stores from vector registers can only do reg+reg + addressing. Altivec registers can also do (reg+reg)&(-16). */ + else if (rclass == VSX_REGS || rclass == ALTIVEC_REGS + || rclass == FLOAT_REGS || rclass == NO_REGS) + { + if (!VECTOR_MEM_ALTIVEC_P (mode) + && GET_CODE (addr) == AND + && GET_CODE (XEXP (addr, 1)) == CONST_INT + && INTVAL (XEXP (addr, 1)) == -16 + && (legitimate_indirect_address_p (XEXP (addr, 0), false) + || legitimate_indexed_address_p (XEXP (addr, 0), false))) + { + sri->icode = icode; + sri->extra_cost = ((GET_CODE (XEXP (addr, 0)) == PLUS) + ? 2 : 1); + } + else if (!legitimate_indirect_address_p (addr, false) + && (rclass == NO_REGS + || !legitimate_indexed_address_p (addr, false))) + { + sri->icode = icode; + sri->extra_cost = 1; + } + else + icode = CODE_FOR_nothing; + } + /* Any other loads, including to pseudo registers which haven't been + assigned to a register yet, default to require a scratch + register. */ + else + { + sri->icode = icode; + sri->extra_cost = 2; + } + } + else if (REG_P (x)) + { + int regno = true_regnum (x); + + icode = CODE_FOR_nothing; + if (regno < 0 || regno >= FIRST_PSEUDO_REGISTER) + default_p = true; + else + { + enum reg_class xclass = REGNO_REG_CLASS (regno); + enum reload_reg_type rtype1 = rs6000_reload_register_type (rclass); + enum reload_reg_type rtype2 = rs6000_reload_register_type (xclass); + + /* If memory is needed, use default_secondary_reload to create the + stack slot. 
*/ + if (rtype1 != rtype2 || rtype1 == OTHER_REGISTER_TYPE) + default_p = true; + else + ret = NO_REGS; + } + } + else + default_p = true; + } + else + default_p = true; + + if (default_p) + ret = default_secondary_reload (in_p, x, rclass, mode, sri); + + gcc_assert (ret != ALL_REGS); + + if (TARGET_DEBUG_ADDR) + { + fprintf (stderr, + "\nrs6000_secondary_reload, return %s, in_p = %s, rclass = %s, " + "mode = %s", + reg_class_names[ret], + in_p ? "true" : "false", + reg_class_names[rclass], + GET_MODE_NAME (mode)); - /* Both of these must have the same base register. */ - if (reg1 != reg2) - return 0; + if (default_p) + fprintf (stderr, ", default secondary reload"); - /* The offset for the second addr must be 8 more than the first addr. */ - if (offset2 != offset1 + 8) - return 0; + if (sri->icode != CODE_FOR_nothing) + fprintf (stderr, ", reload func = %s, extra cost = %d\n", + insn_data[sri->icode].name, sri->extra_cost); + else + fprintf (stderr, "\n"); - /* All the tests passed. addr1 and addr2 are valid for lfq or stfq - instructions. */ - return 1; + debug_rtx (x); + } + + return ret; } - -rtx -rs6000_secondary_memory_needed_rtx (enum machine_mode mode) +/* Fixup reload addresses for Altivec or VSX loads/stores to change SP+offset + to SP+reg addressing. */ + +void +rs6000_secondary_reload_inner (rtx reg, rtx mem, rtx scratch, bool store_p) { - static bool eliminated = false; - if (mode != SDmode) - return assign_stack_local (mode, GET_MODE_SIZE (mode), 0); - else - { - rtx mem = cfun->machine->sdmode_stack_slot; - gcc_assert (mem != NULL_RTX); + int regno = true_regnum (reg); + enum machine_mode mode = GET_MODE (reg); + enum reg_class rclass; + rtx addr; + rtx and_op2 = NULL_RTX; + rtx addr_op1; + rtx addr_op2; + rtx scratch_or_premodify = scratch; + rtx and_rtx; + rtx cc_clobber; + + if (TARGET_DEBUG_ADDR) + { + fprintf (stderr, "\nrs6000_secondary_reload_inner, type = %s\n", + store_p ? 
"store" : "load"); + fprintf (stderr, "reg:\n"); + debug_rtx (reg); + fprintf (stderr, "mem:\n"); + debug_rtx (mem); + fprintf (stderr, "scratch:\n"); + debug_rtx (scratch); + } + + gcc_assert (regno >= 0 && regno < FIRST_PSEUDO_REGISTER); + gcc_assert (GET_CODE (mem) == MEM); + rclass = REGNO_REG_CLASS (regno); + addr = XEXP (mem, 0); + + switch (rclass) + { + /* GPRs can handle reg + small constant, all other addresses need to use + the scratch register. */ + case GENERAL_REGS: + case BASE_REGS: + if (GET_CODE (addr) == AND) + { + and_op2 = XEXP (addr, 1); + addr = XEXP (addr, 0); + } + + if (GET_CODE (addr) == PRE_MODIFY) + { + scratch_or_premodify = XEXP (addr, 0); + gcc_assert (REG_P (scratch_or_premodify)); + gcc_assert (GET_CODE (XEXP (addr, 1)) == PLUS); + addr = XEXP (addr, 1); + } + + if (GET_CODE (addr) == PLUS + && (!rs6000_legitimate_offset_address_p (TImode, addr, false) + || and_op2 != NULL_RTX)) + { + addr_op1 = XEXP (addr, 0); + addr_op2 = XEXP (addr, 1); + gcc_assert (legitimate_indirect_address_p (addr_op1, false)); + + if (!REG_P (addr_op2) + && (GET_CODE (addr_op2) != CONST_INT + || !satisfies_constraint_I (addr_op2))) + { + if (TARGET_DEBUG_ADDR) + { + fprintf (stderr, + "\nMove plus addr to register %s, mode = %s: ", + rs6000_reg_names[REGNO (scratch)], + GET_MODE_NAME (mode)); + debug_rtx (addr_op2); + } + rs6000_emit_move (scratch, addr_op2, Pmode); + addr_op2 = scratch; + } - if (!eliminated) + emit_insn (gen_rtx_SET (VOIDmode, + scratch_or_premodify, + gen_rtx_PLUS (Pmode, + addr_op1, + addr_op2))); + + addr = scratch_or_premodify; + scratch_or_premodify = scratch; + } + else if (!legitimate_indirect_address_p (addr, false) + && !rs6000_legitimate_offset_address_p (TImode, addr, false)) { - mem = eliminate_regs (mem, VOIDmode, NULL_RTX); - cfun->machine->sdmode_stack_slot = mem; - eliminated = true; + if (TARGET_DEBUG_ADDR) + { + fprintf (stderr, "\nMove addr to register %s, mode = %s: ", + rs6000_reg_names[REGNO (scratch_or_premodify)], 
+ GET_MODE_NAME (mode)); + debug_rtx (addr); + } + rs6000_emit_move (scratch_or_premodify, addr, Pmode); + addr = scratch_or_premodify; + scratch_or_premodify = scratch; + } + break; + + /* Float/Altivec registers can only handle reg+reg addressing. Move + other addresses into a scratch register. */ + case FLOAT_REGS: + case VSX_REGS: + case ALTIVEC_REGS: + + /* With float regs, we need to handle the AND ourselves, since we can't + use the Altivec instruction with an implicit AND -16. Allow scalar + loads to float registers to use reg+offset even if VSX. */ + if (GET_CODE (addr) == AND + && (rclass != ALTIVEC_REGS || GET_MODE_SIZE (mode) != 16 + || GET_CODE (XEXP (addr, 1)) != CONST_INT + || INTVAL (XEXP (addr, 1)) != -16 + || !VECTOR_MEM_ALTIVEC_P (mode))) + { + and_op2 = XEXP (addr, 1); + addr = XEXP (addr, 0); + } + + /* If we aren't using a VSX load, save the PRE_MODIFY register and use it + as the address later. */ + if (GET_CODE (addr) == PRE_MODIFY + && (!VECTOR_MEM_VSX_P (mode) + || and_op2 != NULL_RTX + || !legitimate_indexed_address_p (XEXP (addr, 1), false))) + { + scratch_or_premodify = XEXP (addr, 0); + gcc_assert (legitimate_indirect_address_p (scratch_or_premodify, + false)); + gcc_assert (GET_CODE (XEXP (addr, 1)) == PLUS); + addr = XEXP (addr, 1); + } + + if (legitimate_indirect_address_p (addr, false) /* reg */ + || legitimate_indexed_address_p (addr, false) /* reg+reg */ + || GET_CODE (addr) == PRE_MODIFY /* VSX pre-modify */ + || (GET_CODE (addr) == AND /* Altivec memory */ + && GET_CODE (XEXP (addr, 1)) == CONST_INT + && INTVAL (XEXP (addr, 1)) == -16 + && VECTOR_MEM_ALTIVEC_P (mode)) + || (rclass == FLOAT_REGS /* legacy float mem */ + && GET_MODE_SIZE (mode) == 8 + && and_op2 == NULL_RTX + && scratch_or_premodify == scratch + && rs6000_legitimate_offset_address_p (mode, addr, false))) + ; + + else if (GET_CODE (addr) == PLUS) + { + addr_op1 = XEXP (addr, 0); + addr_op2 = XEXP (addr, 1); + gcc_assert (REG_P (addr_op1)); + + if 
(TARGET_DEBUG_ADDR) + { + fprintf (stderr, "\nMove plus addr to register %s, mode = %s: ", + rs6000_reg_names[REGNO (scratch)], GET_MODE_NAME (mode)); + debug_rtx (addr_op2); + } + rs6000_emit_move (scratch, addr_op2, Pmode); + emit_insn (gen_rtx_SET (VOIDmode, + scratch_or_premodify, + gen_rtx_PLUS (Pmode, + addr_op1, + scratch))); + addr = scratch_or_premodify; + scratch_or_premodify = scratch; + } + + else if (GET_CODE (addr) == SYMBOL_REF || GET_CODE (addr) == CONST + || GET_CODE (addr) == CONST_INT || REG_P (addr)) + { + if (TARGET_DEBUG_ADDR) + { + fprintf (stderr, "\nMove addr to register %s, mode = %s: ", + rs6000_reg_names[REGNO (scratch_or_premodify)], + GET_MODE_NAME (mode)); + debug_rtx (addr); + } + + rs6000_emit_move (scratch_or_premodify, addr, Pmode); + addr = scratch_or_premodify; + scratch_or_premodify = scratch; } - return mem; + + else + gcc_unreachable (); + + break; + + default: + gcc_unreachable (); } -} -static tree -rs6000_check_sdmode (tree *tp, int *walk_subtrees, void *data ATTRIBUTE_UNUSED) -{ - /* Don't walk into types. */ - if (*tp == NULL_TREE || *tp == error_mark_node || TYPE_P (*tp)) + /* If the original address involved a pre-modify that we couldn't use the VSX + memory instruction with update, and we haven't taken care of already, + store the address in the pre-modify register and use that as the + address. */ + if (scratch_or_premodify != scratch && scratch_or_premodify != addr) { - *walk_subtrees = 0; - return NULL_TREE; + emit_insn (gen_rtx_SET (VOIDmode, scratch_or_premodify, addr)); + addr = scratch_or_premodify; } - switch (TREE_CODE (*tp)) + /* If the original address involved an AND -16 and we couldn't use an ALTIVEC + memory instruction, recreate the AND now, including the clobber which is + generated by the general ANDSI3/ANDDI3 patterns for the + andi. instruction. 
*/ + if (and_op2 != NULL_RTX) { - case VAR_DECL: - case PARM_DECL: - case FIELD_DECL: - case RESULT_DECL: - case REAL_CST: - case INDIRECT_REF: - case ALIGN_INDIRECT_REF: - case MISALIGNED_INDIRECT_REF: - case VIEW_CONVERT_EXPR: - if (TYPE_MODE (TREE_TYPE (*tp)) == SDmode) - return *tp; - break; - default: - break; + if (! legitimate_indirect_address_p (addr, false)) + { + emit_insn (gen_rtx_SET (VOIDmode, scratch, addr)); + addr = scratch; + } + + if (TARGET_DEBUG_ADDR) + { + fprintf (stderr, "\nAnd addr to register %s, mode = %s: ", + rs6000_reg_names[REGNO (scratch)], GET_MODE_NAME (mode)); + debug_rtx (and_op2); + } + + and_rtx = gen_rtx_SET (VOIDmode, + scratch, + gen_rtx_AND (Pmode, + addr, + and_op2)); + + cc_clobber = gen_rtx_CLOBBER (CCmode, gen_rtx_SCRATCH (CCmode)); + emit_insn (gen_rtx_PARALLEL (VOIDmode, + gen_rtvec (2, and_rtx, cc_clobber))); + addr = scratch; } - return NULL_TREE; -} + /* Adjust the address if it changed. */ + if (addr != XEXP (mem, 0)) + { + mem = change_address (mem, mode, addr); + if (TARGET_DEBUG_ADDR) + fprintf (stderr, "\nrs6000_secondary_reload_inner, mem adjusted.\n"); + } + + /* Now create the move. */ + if (store_p) + emit_insn (gen_rtx_SET (VOIDmode, mem, reg)); + else + emit_insn (gen_rtx_SET (VOIDmode, reg, mem)); + return; +} /* Allocate a 64-bit stack slot to be used for copying SDmode values through if this function has any SDmode references. */ @@ -11323,13 +13454,163 @@ rs6000_instantiate_decls (void) instantiate_decl_rtl (cfun->machine->sdmode_stack_slot); } +/* Given an rtx X being reloaded into a reg required to be + in class CLASS, return the class of reg to actually use. + In general this is just CLASS; but on some machines + in some cases it is preferable to use a more restrictive class. + + On the RS/6000, we have to return NO_REGS when we want to reload a + floating-point CONST_DOUBLE to force it to be copied to memory. 
+ + We also don't want to reload integer values into floating-point + registers if we can at all help it. In fact, this can + cause reload to die, if it tries to generate a reload of CTR + into a FP register and discovers it doesn't have the memory location + required. + + ??? Would it be a good idea to have reload do the converse, that is + try to reload floating modes into FP registers if possible? + */ + +static enum reg_class +rs6000_preferred_reload_class (rtx x, enum reg_class rclass) +{ + enum machine_mode mode = GET_MODE (x); + + if (VECTOR_UNIT_VSX_P (mode) + && x == CONST0_RTX (mode) && VSX_REG_CLASS_P (rclass)) + return rclass; + + if (VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode) + && (rclass == ALTIVEC_REGS || rclass == VSX_REGS) + && easy_vector_constant (x, mode)) + return ALTIVEC_REGS; + + if (CONSTANT_P (x) && reg_classes_intersect_p (rclass, FLOAT_REGS)) + return NO_REGS; + + if (GET_MODE_CLASS (mode) == MODE_INT && rclass == NON_SPECIAL_REGS) + return GENERAL_REGS; + + /* For VSX, prefer the traditional registers for DF if the address is of the + form reg+offset because we can use the non-VSX loads. Prefer the Altivec + registers if Altivec is handling the vector operations (i.e. V16QI, V8HI, + and V4SI). */ + if (rclass == VSX_REGS && VECTOR_MEM_VSX_P (mode)) + { + if (mode == DFmode && GET_CODE (x) == MEM) + { + rtx addr = XEXP (x, 0); + + if (legitimate_indirect_address_p (addr, false)) /* reg */ + return VSX_REGS; + + if (legitimate_indexed_address_p (addr, false)) /* reg+reg */ + return VSX_REGS; + + if (GET_CODE (addr) == PRE_MODIFY + && legitimate_indexed_address_p (XEXP (addr, 0), false)) + return VSX_REGS; + + return FLOAT_REGS; + } + + if (VECTOR_UNIT_ALTIVEC_P (mode)) + return ALTIVEC_REGS; + + return rclass; + } + + return rclass; +} + +/* Debug version of rs6000_preferred_reload_class. 
*/ +static enum reg_class +rs6000_debug_preferred_reload_class (rtx x, enum reg_class rclass) +{ + enum reg_class ret = rs6000_preferred_reload_class (x, rclass); + + fprintf (stderr, + "\nrs6000_preferred_reload_class, return %s, rclass = %s, " + "mode = %s, x:\n", + reg_class_names[ret], reg_class_names[rclass], + GET_MODE_NAME (GET_MODE (x))); + debug_rtx (x); + + return ret; +} + +/* If we are copying between FP or AltiVec registers and anything else, we need + a memory location. The exception is when we are targeting ppc64 and the + move to/from fpr to gpr instructions are available. Also, under VSX, you + can copy vector registers from the FP register set to the Altivec register + set and vice versa. */ + +static bool +rs6000_secondary_memory_needed (enum reg_class class1, + enum reg_class class2, + enum machine_mode mode) +{ + if (class1 == class2) + return false; + + /* Under VSX, there are 3 register classes that values could be in (VSX_REGS, + ALTIVEC_REGS, and FLOAT_REGS). We don't need to use memory to copy + between these classes. But we need memory for other things that can go in + FLOAT_REGS like SFmode. */ + if (TARGET_VSX + && (VECTOR_MEM_VSX_P (mode) || VECTOR_UNIT_VSX_P (mode)) + && (class1 == VSX_REGS || class1 == ALTIVEC_REGS + || class1 == FLOAT_REGS)) + return (class2 != VSX_REGS && class2 != ALTIVEC_REGS + && class2 != FLOAT_REGS); + + if (class1 == VSX_REGS || class2 == VSX_REGS) + return true; + + if (class1 == FLOAT_REGS + && (!TARGET_MFPGPR || !TARGET_POWERPC64 + || ((mode != DFmode) + && (mode != DDmode) + && (mode != DImode)))) + return true; + + if (class2 == FLOAT_REGS + && (!TARGET_MFPGPR || !TARGET_POWERPC64 + || ((mode != DFmode) + && (mode != DDmode) + && (mode != DImode)))) + return true; + + if (class1 == ALTIVEC_REGS || class2 == ALTIVEC_REGS) + return true; + + return false; +} + +/* Debug version of rs6000_secondary_memory_needed. 
*/ +static bool +rs6000_debug_secondary_memory_needed (enum reg_class class1, + enum reg_class class2, + enum machine_mode mode) +{ + bool ret = rs6000_secondary_memory_needed (class1, class2, mode); + + fprintf (stderr, + "rs6000_secondary_memory_needed, return: %s, class1 = %s, " + "class2 = %s, mode = %s\n", + ret ? "true" : "false", reg_class_names[class1], + reg_class_names[class2], GET_MODE_NAME (mode)); + + return ret; +} + /* Return the register class of a scratch register needed to copy IN into - or out of a register in CLASS in MODE. If it can be done directly, + or out of a register in RCLASS in MODE. If it can be done directly, NO_REGS is returned. */ -enum reg_class -rs6000_secondary_reload_class (enum reg_class class, - enum machine_mode mode ATTRIBUTE_UNUSED, +static enum reg_class +rs6000_secondary_reload_class (enum reg_class rclass, enum machine_mode mode, rtx in) { int regno; @@ -11347,7 +13628,7 @@ rs6000_secondary_reload_class (enum reg_ On Darwin, pic addresses require a load from memory, which needs a base register. */ - if (class != BASE_REGS + if (rclass != BASE_REGS && (GET_CODE (in) == SYMBOL_REF || GET_CODE (in) == HIGH || GET_CODE (in) == LABEL_REF @@ -11376,28 +13657,112 @@ rs6000_secondary_reload_class (enum reg_ /* We can place anything into GENERAL_REGS and can put GENERAL_REGS into anything. */ - if (class == GENERAL_REGS || class == BASE_REGS + if (rclass == GENERAL_REGS || rclass == BASE_REGS || (regno >= 0 && INT_REGNO_P (regno))) return NO_REGS; /* Constants, memory, and FP registers can go into FP registers. */ if ((regno == -1 || FP_REGNO_P (regno)) - && (class == FLOAT_REGS || class == NON_SPECIAL_REGS)) + && (rclass == FLOAT_REGS || rclass == NON_SPECIAL_REGS)) return (mode != SDmode) ? NO_REGS : GENERAL_REGS; + /* Memory, and FP/altivec registers can go into fp/altivec registers under + VSX. 
*/ + if (TARGET_VSX + && (regno == -1 || VSX_REGNO_P (regno)) + && VSX_REG_CLASS_P (rclass)) + return NO_REGS; + /* Memory, and AltiVec registers can go into AltiVec registers. */ if ((regno == -1 || ALTIVEC_REGNO_P (regno)) - && class == ALTIVEC_REGS) + && rclass == ALTIVEC_REGS) return NO_REGS; /* We can copy among the CR registers. */ - if ((class == CR_REGS || class == CR0_REGS) + if ((rclass == CR_REGS || rclass == CR0_REGS) && regno >= 0 && CR_REGNO_P (regno)) return NO_REGS; /* Otherwise, we need GENERAL_REGS. */ return GENERAL_REGS; } + +/* Debug version of rs6000_secondary_reload_class. */ +static enum reg_class +rs6000_debug_secondary_reload_class (enum reg_class rclass, + enum machine_mode mode, rtx in) +{ + enum reg_class ret = rs6000_secondary_reload_class (rclass, mode, in); + fprintf (stderr, + "\nrs6000_secondary_reload_class, return %s, rclass = %s, " + "mode = %s, input rtx:\n", + reg_class_names[ret], reg_class_names[rclass], + GET_MODE_NAME (mode)); + debug_rtx (in); + + return ret; +} + +/* Return nonzero if for CLASS a mode change from FROM to TO is invalid. */ + +static bool +rs6000_cannot_change_mode_class (enum machine_mode from, + enum machine_mode to, + enum reg_class rclass) +{ + unsigned from_size = GET_MODE_SIZE (from); + unsigned to_size = GET_MODE_SIZE (to); + + if (from_size != to_size) + { + enum reg_class xclass = (TARGET_VSX) ? 
VSX_REGS : FLOAT_REGS; + return ((from_size < 8 || to_size < 8 || TARGET_IEEEQUAD) + && reg_classes_intersect_p (xclass, rclass)); + } + + if (TARGET_E500_DOUBLE + && ((((to) == DFmode) + ((from) == DFmode)) == 1 + || (((to) == TFmode) + ((from) == TFmode)) == 1 + || (((to) == DDmode) + ((from) == DDmode)) == 1 + || (((to) == TDmode) + ((from) == TDmode)) == 1 + || (((to) == DImode) + ((from) == DImode)) == 1)) + return true; + + /* Since the VSX register set includes traditional floating point registers + and altivec registers, just check for the size being different instead of + trying to check whether the modes are vector modes. Otherwise it won't + allow say DF and DI to change classes. */ + if (TARGET_VSX && VSX_REG_CLASS_P (rclass)) + return (from_size != 8 && from_size != 16); + + if (TARGET_ALTIVEC && rclass == ALTIVEC_REGS + && (ALTIVEC_VECTOR_MODE (from) + ALTIVEC_VECTOR_MODE (to)) == 1) + return true; + + if (TARGET_SPE && (SPE_VECTOR_MODE (from) + SPE_VECTOR_MODE (to)) == 1 + && reg_classes_intersect_p (GENERAL_REGS, rclass)) + return true; + + return false; +} + +/* Debug version of rs6000_cannot_change_mode_class. */ +static bool +rs6000_debug_cannot_change_mode_class (enum machine_mode from, + enum machine_mode to, + enum reg_class rclass) +{ + bool ret = rs6000_cannot_change_mode_class (from, to, rclass); + + fprintf (stderr, + "rs6000_cannot_change_mode_class, return %s, from = %s, " + "to = %s, rclass = %s\n", + ret ? "true" : "false", + GET_MODE_NAME (from), GET_MODE_NAME (to), + reg_class_names[rclass]); + + return ret; +} /* Given a comparison operation, return the bit number in CCR to test. We know this is a valid comparison. @@ -11691,7 +14056,7 @@ print_operand (FILE *file, rtx x, int co case 'c': /* X is a CR register. Print the number of the GT bit of the CR. */ if (GET_CODE (x) != REG || ! 
CR_REGNO_P (REGNO (x))) - output_operand_lossage ("invalid %%E value"); + output_operand_lossage ("invalid %%c value"); else fprintf (file, "%d", 4 * (REGNO (x) - CR0_REGNO) + 1); return; @@ -12128,6 +14493,26 @@ print_operand (FILE *file, rtx x, int co fprintf (file, "%d", i + 1); return; + case 'x': + /* X is a FPR or Altivec register used in a VSX context. */ + if (GET_CODE (x) != REG || !VSX_REGNO_P (REGNO (x))) + output_operand_lossage ("invalid %%x value"); + else + { + int reg = REGNO (x); + int vsx_reg = (FP_REGNO_P (reg) + ? reg - 32 + : reg - FIRST_ALTIVEC_REGNO + 32); + +#ifdef TARGET_REGNAMES + if (TARGET_REGNAMES) + fprintf (file, "%%vs%d", vsx_reg); + else +#endif + fprintf (file, "%d", vsx_reg); + } + return; + case 'X': if (GET_CODE (x) == MEM && (legitimate_indexed_address_p (XEXP (x, 0), 0) @@ -12240,18 +14625,25 @@ print_operand (FILE *file, rtx x, int co /* Fall through. Must be [reg+reg]. */ } - if (TARGET_ALTIVEC + if (VECTOR_MEM_ALTIVEC_P (GET_MODE (x)) && GET_CODE (tmp) == AND && GET_CODE (XEXP (tmp, 1)) == CONST_INT && INTVAL (XEXP (tmp, 1)) == -16) tmp = XEXP (tmp, 0); + else if (VECTOR_MEM_VSX_P (GET_MODE (x)) + && GET_CODE (tmp) == PRE_MODIFY) + tmp = XEXP (tmp, 1); if (GET_CODE (tmp) == REG) fprintf (file, "0,%s", reg_names[REGNO (tmp)]); else { - gcc_assert (GET_CODE (tmp) == PLUS - && REG_P (XEXP (tmp, 0)) - && REG_P (XEXP (tmp, 1))); + if (GET_CODE (tmp) != PLUS + || !REG_P (XEXP (tmp, 0)) + || !REG_P (XEXP (tmp, 1))) + { + output_operand_lossage ("invalid %%y value, try using the 'Z' constraint"); + break; + } if (REGNO (XEXP (tmp, 0)) == 0) fprintf (file, "%s,%s", reg_names[ REGNO (XEXP (tmp, 1)) ], @@ -12343,48 +14735,45 @@ print_operand_address (FILE *file, rtx x #endif else if (legitimate_constant_pool_address_p (x)) { - if (TARGET_AIX && (!TARGET_ELF || !TARGET_MINIMAL_TOC)) - { - rtx contains_minus = XEXP (x, 1); - rtx minus, symref; - const char *name; - - /* Find the (minus (sym) (toc)) buried in X, and temporarily - turn
it into (sym) for output_addr_const. */ - while (GET_CODE (XEXP (contains_minus, 0)) != MINUS) - contains_minus = XEXP (contains_minus, 0); - - minus = XEXP (contains_minus, 0); - symref = XEXP (minus, 0); - XEXP (contains_minus, 0) = symref; - if (TARGET_ELF) - { - char *newname; - - name = XSTR (symref, 0); - newname = alloca (strlen (name) + sizeof ("@toc")); - strcpy (newname, name); - strcat (newname, "@toc"); - XSTR (symref, 0) = newname; - } - output_addr_const (file, XEXP (x, 1)); - if (GET_CODE (XEXP (minus, 1)) == CONST - && (GET_CODE (XEXP (XEXP (minus, 1), 0)) == PLUS)) - fprintf (file, "+"HOST_WIDE_INT_PRINT_DEC, - -INTVAL (XEXP (XEXP (XEXP (minus, 1), 0), 1))); - if (TARGET_ELF) - XSTR (symref, 0) = name; - XEXP (contains_minus, 0) = minus; - } - else - output_addr_const (file, XEXP (x, 1)); - + output_addr_const (file, XEXP (x, 1)); fprintf (file, "(%s)", reg_names[REGNO (XEXP (x, 0))]); } else gcc_unreachable (); } +/* Implement OUTPUT_ADDR_CONST_EXTRA for address X. */ + +bool +rs6000_output_addr_const_extra (FILE *file, rtx x) +{ + if (GET_CODE (x) == UNSPEC) + switch (XINT (x, 1)) + { + case UNSPEC_TOCREL: + x = XVECEXP (x, 0, 0); + gcc_assert (GET_CODE (x) == SYMBOL_REF); + output_addr_const (file, x); + if (!TARGET_AIX || (TARGET_ELF && TARGET_MINIMAL_TOC)) + { + putc ('-', file); + assemble_name (file, toc_label_name); + } + else if (TARGET_ELF) + fputs ("@toc", file); + return true; + +#if TARGET_MACHO + case UNSPEC_MACHOPIC_OFFSET: + output_addr_const (file, XVECEXP (x, 0, 0)); + putc ('-', file); + machopic_output_function_base_name (file); + return true; +#endif + } + return false; +} + /* Target hook for assembling integer objects. The PowerPC version has to handle fixup entries for relocatable code if RELOCATABLE_NEEDS_FIXUP is defined. 
It also needs to handle DI-mode objects on 64-bit @@ -12540,7 +14929,7 @@ rs6000_generate_compare (enum rtx_code c switch (op_mode) { case SFmode: - cmp = flag_unsafe_math_optimizations + cmp = (flag_finite_math_only && !flag_trapping_math) ? gen_tstsfeq_gpr (compare_result, rs6000_compare_op0, rs6000_compare_op1) : gen_cmpsfeq_gpr (compare_result, rs6000_compare_op0, @@ -12548,7 +14937,7 @@ rs6000_generate_compare (enum rtx_code c break; case DFmode: - cmp = flag_unsafe_math_optimizations + cmp = (flag_finite_math_only && !flag_trapping_math) ? gen_tstdfeq_gpr (compare_result, rs6000_compare_op0, rs6000_compare_op1) : gen_cmpdfeq_gpr (compare_result, rs6000_compare_op0, @@ -12556,7 +14945,7 @@ rs6000_generate_compare (enum rtx_code c break; case TFmode: - cmp = flag_unsafe_math_optimizations + cmp = (flag_finite_math_only && !flag_trapping_math) ? gen_tsttfeq_gpr (compare_result, rs6000_compare_op0, rs6000_compare_op1) : gen_cmptfeq_gpr (compare_result, rs6000_compare_op0, @@ -12572,7 +14961,7 @@ rs6000_generate_compare (enum rtx_code c switch (op_mode) { case SFmode: - cmp = flag_unsafe_math_optimizations + cmp = (flag_finite_math_only && !flag_trapping_math) ? gen_tstsfgt_gpr (compare_result, rs6000_compare_op0, rs6000_compare_op1) : gen_cmpsfgt_gpr (compare_result, rs6000_compare_op0, @@ -12580,7 +14969,7 @@ rs6000_generate_compare (enum rtx_code c break; case DFmode: - cmp = flag_unsafe_math_optimizations + cmp = (flag_finite_math_only && !flag_trapping_math) ? gen_tstdfgt_gpr (compare_result, rs6000_compare_op0, rs6000_compare_op1) : gen_cmpdfgt_gpr (compare_result, rs6000_compare_op0, @@ -12588,7 +14977,7 @@ rs6000_generate_compare (enum rtx_code c break; case TFmode: - cmp = flag_unsafe_math_optimizations + cmp = (flag_finite_math_only && !flag_trapping_math) ? 
gen_tsttfgt_gpr (compare_result, rs6000_compare_op0, rs6000_compare_op1) : gen_cmptfgt_gpr (compare_result, rs6000_compare_op0, @@ -12604,7 +14993,7 @@ rs6000_generate_compare (enum rtx_code c switch (op_mode) { case SFmode: - cmp = flag_unsafe_math_optimizations + cmp = (flag_finite_math_only && !flag_trapping_math) ? gen_tstsflt_gpr (compare_result, rs6000_compare_op0, rs6000_compare_op1) : gen_cmpsflt_gpr (compare_result, rs6000_compare_op0, @@ -12612,7 +15001,7 @@ rs6000_generate_compare (enum rtx_code c break; case DFmode: - cmp = flag_unsafe_math_optimizations + cmp = (flag_finite_math_only && !flag_trapping_math) ? gen_tstdflt_gpr (compare_result, rs6000_compare_op0, rs6000_compare_op1) : gen_cmpdflt_gpr (compare_result, rs6000_compare_op0, @@ -12620,7 +15009,7 @@ rs6000_generate_compare (enum rtx_code c break; case TFmode: - cmp = flag_unsafe_math_optimizations + cmp = (flag_finite_math_only && !flag_trapping_math) ? gen_tsttflt_gpr (compare_result, rs6000_compare_op0, rs6000_compare_op1) : gen_cmptflt_gpr (compare_result, rs6000_compare_op0, @@ -12655,7 +15044,7 @@ rs6000_generate_compare (enum rtx_code c switch (op_mode) { case SFmode: - cmp = flag_unsafe_math_optimizations + cmp = (flag_finite_math_only && !flag_trapping_math) ? gen_tstsfeq_gpr (compare_result2, rs6000_compare_op0, rs6000_compare_op1) : gen_cmpsfeq_gpr (compare_result2, rs6000_compare_op0, @@ -12663,7 +15052,7 @@ rs6000_generate_compare (enum rtx_code c break; case DFmode: - cmp = flag_unsafe_math_optimizations + cmp = (flag_finite_math_only && !flag_trapping_math) ? gen_tstdfeq_gpr (compare_result2, rs6000_compare_op0, rs6000_compare_op1) : gen_cmpdfeq_gpr (compare_result2, rs6000_compare_op0, @@ -12671,7 +15060,7 @@ rs6000_generate_compare (enum rtx_code c break; case TFmode: - cmp = flag_unsafe_math_optimizations + cmp = (flag_finite_math_only && !flag_trapping_math) ? 
gen_tsttfeq_gpr (compare_result2, rs6000_compare_op0, rs6000_compare_op1) : gen_cmptfeq_gpr (compare_result2, rs6000_compare_op0, @@ -13017,55 +15406,34 @@ output_e500_flip_gt_bit (rtx dst, rtx sr return string; } -/* Return insn index for the vector compare instruction for given CODE, - and DEST_MODE, OP_MODE. Return INSN_NOT_AVAILABLE if valid insn is - not available. */ +/* Return insn for VSX or Altivec comparisons. */ -static int -get_vec_cmp_insn (enum rtx_code code, - enum machine_mode dest_mode, - enum machine_mode op_mode) +static rtx +rs6000_emit_vector_compare_inner (enum rtx_code code, rtx op0, rtx op1) { - if (!TARGET_ALTIVEC) - return INSN_NOT_AVAILABLE; + rtx mask; + enum machine_mode mode = GET_MODE (op0); switch (code) { - case EQ: - if (dest_mode == V16QImode && op_mode == V16QImode) - return UNSPEC_VCMPEQUB; - if (dest_mode == V8HImode && op_mode == V8HImode) - return UNSPEC_VCMPEQUH; - if (dest_mode == V4SImode && op_mode == V4SImode) - return UNSPEC_VCMPEQUW; - if (dest_mode == V4SImode && op_mode == V4SFmode) - return UNSPEC_VCMPEQFP; + default: break; + case GE: - if (dest_mode == V4SImode && op_mode == V4SFmode) - return UNSPEC_VCMPGEFP; + if (GET_MODE_CLASS (mode) == MODE_VECTOR_INT) + return NULL_RTX; + + case EQ: case GT: - if (dest_mode == V16QImode && op_mode == V16QImode) - return UNSPEC_VCMPGTSB; - if (dest_mode == V8HImode && op_mode == V8HImode) - return UNSPEC_VCMPGTSH; - if (dest_mode == V4SImode && op_mode == V4SImode) - return UNSPEC_VCMPGTSW; - if (dest_mode == V4SImode && op_mode == V4SFmode) - return UNSPEC_VCMPGTFP; - break; case GTU: - if (dest_mode == V16QImode && op_mode == V16QImode) - return UNSPEC_VCMPGTUB; - if (dest_mode == V8HImode && op_mode == V8HImode) - return UNSPEC_VCMPGTUH; - if (dest_mode == V4SImode && op_mode == V4SImode) - return UNSPEC_VCMPGTUW; - break; - default: - break; + mask = gen_reg_rtx (mode); + emit_insn (gen_rtx_SET (VOIDmode, + mask, + gen_rtx_fmt_ee (code, mode, op0, op1))); + return mask; } 
- return INSN_NOT_AVAILABLE; + + return NULL_RTX; } /* Emit vector compare for operands OP0 and OP1 using code RCODE. @@ -13076,129 +15444,114 @@ rs6000_emit_vector_compare (enum rtx_cod rtx op0, rtx op1, enum machine_mode dmode) { - int vec_cmp_insn; rtx mask; - enum machine_mode dest_mode; - enum machine_mode op_mode = GET_MODE (op1); + bool swap_operands = false; + bool try_again = false; - gcc_assert (TARGET_ALTIVEC); + gcc_assert (VECTOR_UNIT_ALTIVEC_OR_VSX_P (dmode)); gcc_assert (GET_MODE (op0) == GET_MODE (op1)); - /* Floating point vector compare instructions uses destination V4SImode. - Move destination to appropriate mode later. */ - if (dmode == V4SFmode) - dest_mode = V4SImode; - else - dest_mode = dmode; - - mask = gen_reg_rtx (dest_mode); - vec_cmp_insn = get_vec_cmp_insn (rcode, dest_mode, op_mode); + /* See if the comparison works as is. */ + mask = rs6000_emit_vector_compare_inner (rcode, op0, op1); + if (mask) + return mask; - if (vec_cmp_insn == INSN_NOT_AVAILABLE) + switch (rcode) { - bool swap_operands = false; - bool try_again = false; - switch (rcode) - { - case LT: - rcode = GT; - swap_operands = true; - try_again = true; - break; - case LTU: - rcode = GTU; - swap_operands = true; - try_again = true; - break; - case NE: - case UNLE: - case UNLT: - case UNGE: - case UNGT: - /* Invert condition and try again. - e.g., A != B becomes ~(A==B). 
*/ - { - enum rtx_code rev_code; - enum insn_code nor_code; - rtx eq_rtx; - - rev_code = reverse_condition_maybe_unordered (rcode); - eq_rtx = rs6000_emit_vector_compare (rev_code, op0, op1, - dest_mode); - - nor_code = optab_handler (one_cmpl_optab, (int)dest_mode)->insn_code; - gcc_assert (nor_code != CODE_FOR_nothing); - emit_insn (GEN_FCN (nor_code) (mask, eq_rtx)); + case LT: + rcode = GT; + swap_operands = true; + try_again = true; + break; + case LTU: + rcode = GTU; + swap_operands = true; + try_again = true; + break; + case NE: + case UNLE: + case UNLT: + case UNGE: + case UNGT: + /* Invert condition and try again. + e.g., A != B becomes ~(A==B). */ + { + enum rtx_code rev_code; + enum insn_code nor_code; + rtx mask2; + + rev_code = reverse_condition_maybe_unordered (rcode); + if (rev_code == UNKNOWN) + return NULL_RTX; + + nor_code = optab_handler (one_cmpl_optab, (int)dmode)->insn_code; + if (nor_code == CODE_FOR_nothing) + return NULL_RTX; + + mask2 = rs6000_emit_vector_compare (rev_code, op0, op1, dmode); + if (!mask2) + return NULL_RTX; + + mask = gen_reg_rtx (dmode); + emit_insn (GEN_FCN (nor_code) (mask, mask2)); + return mask; + } + break; + case GE: + case GEU: + case LE: + case LEU: + /* Try GT/GTU/LT/LTU OR EQ */ + { + rtx c_rtx, eq_rtx; + enum insn_code ior_code; + enum rtx_code new_code; - if (dmode != dest_mode) - { - rtx temp = gen_reg_rtx (dest_mode); - convert_move (temp, mask, 0); - return temp; - } - return mask; - } - break; - case GE: - case GEU: - case LE: - case LEU: - /* Try GT/GTU/LT/LTU OR EQ */ + switch (rcode) { - rtx c_rtx, eq_rtx; - enum insn_code ior_code; - enum rtx_code new_code; - - switch (rcode) - { - case GE: - new_code = GT; - break; - - case GEU: - new_code = GTU; - break; + case GE: + new_code = GT; + break; - case LE: - new_code = LT; - break; + case GEU: + new_code = GTU; + break; - case LEU: - new_code = LTU; - break; + case LE: + new_code = LT; + break; - default: - gcc_unreachable (); - } + case LEU: + new_code = 
LTU; + break; - c_rtx = rs6000_emit_vector_compare (new_code, - op0, op1, dest_mode); - eq_rtx = rs6000_emit_vector_compare (EQ, op0, op1, - dest_mode); - - ior_code = optab_handler (ior_optab, (int)dest_mode)->insn_code; - gcc_assert (ior_code != CODE_FOR_nothing); - emit_insn (GEN_FCN (ior_code) (mask, c_rtx, eq_rtx)); - if (dmode != dest_mode) - { - rtx temp = gen_reg_rtx (dest_mode); - convert_move (temp, mask, 0); - return temp; - } - return mask; + default: + gcc_unreachable (); } - break; - default: - gcc_unreachable (); - } - if (try_again) - { - vec_cmp_insn = get_vec_cmp_insn (rcode, dest_mode, op_mode); - /* You only get two chances. */ - gcc_assert (vec_cmp_insn != INSN_NOT_AVAILABLE); - } + ior_code = optab_handler (ior_optab, (int)dmode)->insn_code; + if (ior_code == CODE_FOR_nothing) + return NULL_RTX; + + c_rtx = rs6000_emit_vector_compare (new_code, op0, op1, dmode); + if (!c_rtx) + return NULL_RTX; + + eq_rtx = rs6000_emit_vector_compare (EQ, op0, op1, dmode); + if (!eq_rtx) + return NULL_RTX; + + mask = gen_reg_rtx (dmode); + emit_insn (GEN_FCN (ior_code) (mask, c_rtx, eq_rtx)); + return mask; + } + break; + default: + return NULL_RTX; + } + if (try_again) + { if (swap_operands) { rtx tmp; @@ -13206,91 +15559,84 @@ rs6000_emit_vector_compare (enum rtx_cod op0 = op1; op1 = tmp; } - } - emit_insn (gen_rtx_SET (VOIDmode, mask, - gen_rtx_UNSPEC (dest_mode, - gen_rtvec (2, op0, op1), - vec_cmp_insn))); - if (dmode != dest_mode) - { - rtx temp = gen_reg_rtx (dest_mode); - convert_move (temp, mask, 0); - return temp; - } - return mask; -} - -/* Return vector select instruction for MODE. Return INSN_NOT_AVAILABLE, if - valid insn doesn exist for given mode. 
*/ - -static int -get_vsel_insn (enum machine_mode mode) -{ - switch (mode) - { - case V4SImode: - return UNSPEC_VSEL4SI; - break; - case V4SFmode: - return UNSPEC_VSEL4SF; - break; - case V8HImode: - return UNSPEC_VSEL8HI; - break; - case V16QImode: - return UNSPEC_VSEL16QI; - break; - default: - return INSN_NOT_AVAILABLE; - break; + mask = rs6000_emit_vector_compare_inner (rcode, op0, op1); + if (mask) + return mask; } - return INSN_NOT_AVAILABLE; -} - -/* Emit vector select insn where DEST is destination using - operands OP1, OP2 and MASK. */ - -static void -rs6000_emit_vector_select (rtx dest, rtx op1, rtx op2, rtx mask) -{ - rtx t, temp; - enum machine_mode dest_mode = GET_MODE (dest); - int vsel_insn_index = get_vsel_insn (GET_MODE (dest)); - - temp = gen_reg_rtx (dest_mode); - /* For each vector element, select op1 when mask is 1 otherwise - select op2. */ - t = gen_rtx_SET (VOIDmode, temp, - gen_rtx_UNSPEC (dest_mode, - gen_rtvec (3, op2, op1, mask), - vsel_insn_index)); - emit_insn (t); - emit_move_insn (dest, temp); - return; + /* You only get two chances. */ + return NULL_RTX; } -/* Emit vector conditional expression. - DEST is destination. OP1 and OP2 are two VEC_COND_EXPR operands. - CC_OP0 and CC_OP1 are the two operands for the relation operation COND. */ +/* Emit vector conditional expression. DEST is destination. OP_TRUE and + OP_FALSE are two VEC_COND_EXPR operands. CC_OP0 and CC_OP1 are the two + operands for the relation operation COND. 
*/ int -rs6000_emit_vector_cond_expr (rtx dest, rtx op1, rtx op2, +rs6000_emit_vector_cond_expr (rtx dest, rtx op_true, rtx op_false, rtx cond, rtx cc_op0, rtx cc_op1) { enum machine_mode dest_mode = GET_MODE (dest); enum rtx_code rcode = GET_CODE (cond); + enum machine_mode cc_mode = CCmode; rtx mask; + rtx cond2; + rtx tmp; + bool invert_move = false; - if (!TARGET_ALTIVEC) + if (VECTOR_UNIT_NONE_P (dest_mode)) return 0; + switch (rcode) + { + /* Swap operands if we can, and fall back to doing the operation as + specified, and doing a NOR to invert the test. */ + case NE: + case UNLE: + case UNLT: + case UNGE: + case UNGT: + /* Invert condition and try again. + e.g., A = (B != C) ? D : E becomes A = (B == C) ? E : D. */ + invert_move = true; + rcode = reverse_condition_maybe_unordered (rcode); + if (rcode == UNKNOWN) + return 0; + break; + + /* Mark unsigned tests with CCUNSmode. */ + case GTU: + case GEU: + case LTU: + case LEU: + cc_mode = CCUNSmode; + break; + + default: + break; + } + /* Get the vector mask for the given relational operations. */ mask = rs6000_emit_vector_compare (rcode, cc_op0, cc_op1, dest_mode); - rs6000_emit_vector_select (dest, op1, op2, mask); + if (!mask) + return 0; + + if (invert_move) + { + tmp = op_true; + op_true = op_false; + op_false = tmp; + } + cond2 = gen_rtx_fmt_ee (NE, cc_mode, mask, const0_rtx); + emit_insn (gen_rtx_SET (VOIDmode, + dest, + gen_rtx_IF_THEN_ELSE (dest_mode, + cond2, + op_true, + op_false))); return 1; } @@ -13487,8 +15833,8 @@ rs6000_emit_int_cmove (rtx dest, rtx op, { rtx condition_rtx, cr; - /* All isel implementations thus far are 32-bits. 
*/ - if (GET_MODE (rs6000_compare_op0) != SImode) + if (GET_MODE (rs6000_compare_op0) != SImode + && (!TARGET_POWERPC64 || GET_MODE (rs6000_compare_op0) != DImode)) return 0; /* We still have to do the compare, because isel doesn't do a @@ -13497,12 +15843,24 @@ rs6000_emit_int_cmove (rtx dest, rtx op, condition_rtx = rs6000_generate_compare (GET_CODE (op)); cr = XEXP (condition_rtx, 0); - if (GET_MODE (cr) == CCmode) - emit_insn (gen_isel_signed (dest, condition_rtx, - true_cond, false_cond, cr)); + if (GET_MODE (rs6000_compare_op0) == SImode) + { + if (GET_MODE (cr) == CCmode) + emit_insn (gen_isel_signed_si (dest, condition_rtx, + true_cond, false_cond, cr)); + else + emit_insn (gen_isel_unsigned_si (dest, condition_rtx, + true_cond, false_cond, cr)); + } else - emit_insn (gen_isel_unsigned (dest, condition_rtx, - true_cond, false_cond, cr)); + { + if (GET_MODE (cr) == CCmode) + emit_insn (gen_isel_signed_di (dest, condition_rtx, + true_cond, false_cond, cr)); + else + emit_insn (gen_isel_unsigned_di (dest, condition_rtx, + true_cond, false_cond, cr)); + } return 1; } @@ -13529,6 +15887,15 @@ rs6000_emit_minmax (rtx dest, enum rtx_c enum rtx_code c; rtx target; + /* VSX/altivec have direct min/max insns. */ + if ((code == SMAX || code == SMIN) && VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)) + { + emit_insn (gen_rtx_SET (VOIDmode, + dest, + gen_rtx_fmt_ee (code, mode, op0, op1))); + return; + } + if (code == SMAX || code == SMIN) c = GE; else @@ -14009,11 +16376,12 @@ rs6000_split_multireg_move (rtx dst, rtx mode = GET_MODE (dst); nregs = hard_regno_nregs[reg][mode]; if (FP_REGNO_P (reg)) - reg_mode = DECIMAL_FLOAT_MODE_P (mode) ? DDmode : DFmode; + reg_mode = DECIMAL_FLOAT_MODE_P (mode) ? DDmode : + ((TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT) ? DFmode : SFmode); else if (ALTIVEC_REGNO_P (reg)) reg_mode = V16QImode; - else if (TARGET_E500_DOUBLE && (mode == TFmode || mode == TDmode)) - reg_mode = DECIMAL_FLOAT_MODE_P (mode) ? 
DDmode : DFmode; + else if (TARGET_E500_DOUBLE && mode == TFmode) + reg_mode = DFmode; else reg_mode = word_mode; reg_mode_size = GET_MODE_SIZE (reg_mode); @@ -14050,9 +16418,7 @@ rs6000_split_multireg_move (rtx dst, rtx delta_rtx = (GET_CODE (XEXP (src, 0)) == PRE_INC ? GEN_INT (GET_MODE_SIZE (GET_MODE (src))) : GEN_INT (-GET_MODE_SIZE (GET_MODE (src)))); - emit_insn (TARGET_32BIT - ? gen_addsi3 (breg, breg, delta_rtx) - : gen_adddi3 (breg, breg, delta_rtx)); + emit_insn (gen_add3_insn (breg, breg, delta_rtx)); src = replace_equiv_address (src, breg); } else if (! rs6000_offsettable_memref_p (src)) @@ -14102,9 +16468,7 @@ rs6000_split_multireg_move (rtx dst, rtx used_update = true; } else - emit_insn (TARGET_32BIT - ? gen_addsi3 (breg, breg, delta_rtx) - : gen_adddi3 (breg, breg, delta_rtx)); + emit_insn (gen_add3_insn (breg, breg, delta_rtx)); dst = replace_equiv_address (dst, breg); } else @@ -14755,8 +17119,7 @@ spe_func_has_64bit_regs_p (void) if (SPE_VECTOR_MODE (mode)) return true; - if (TARGET_E500_DOUBLE && (mode == DFmode || mode == TFmode - || mode == DDmode || mode == TDmode)) + if (TARGET_E500_DOUBLE && (mode == DFmode || mode == TFmode)) return true; } } @@ -15206,13 +17569,25 @@ uses_TOC (void) rtx create_TOC_reference (rtx symbol) { + if (TARGET_DEBUG_ADDR) + { + if (GET_CODE (symbol) == SYMBOL_REF) + fprintf (stderr, "\ncreate_TOC_reference, (symbol_ref %s)\n", + XSTR (symbol, 0)); + else + { + fprintf (stderr, "\ncreate_TOC_reference, code %s:\n", + GET_RTX_NAME (GET_CODE (symbol))); + debug_rtx (symbol); + } + } + if (!can_create_pseudo_p ()) df_set_regs_ever_live (TOC_REGISTER, true); return gen_rtx_PLUS (Pmode, gen_rtx_REG (Pmode, TOC_REGISTER), gen_rtx_CONST (Pmode, - gen_rtx_MINUS (Pmode, symbol, - gen_rtx_SYMBOL_REF (Pmode, toc_label_name)))); + gen_rtx_UNSPEC (Pmode, gen_rtvec (1, symbol), UNSPEC_TOCREL))); } /* If _Unwind_* has been called from within the same module, @@ -15265,15 +17640,18 @@ rs6000_emit_stack_tie (void) /* Emit the 
correct code for allocating stack space, as insns. If COPY_R12, make sure a copy of the old frame is left in r12. + If COPY_R11, make sure a copy of the old frame is left in r11, + in preference to r12 if COPY_R12. The generated code may use hard register 0 as a temporary. */ static void -rs6000_emit_allocate_stack (HOST_WIDE_INT size, int copy_r12) +rs6000_emit_allocate_stack (HOST_WIDE_INT size, int copy_r12, int copy_r11) { rtx insn; rtx stack_reg = gen_rtx_REG (Pmode, STACK_POINTER_REGNUM); rtx tmp_reg = gen_rtx_REG (Pmode, 0); rtx todec = gen_int_mode (-size, Pmode); + rtx par, set, mem; if (INTVAL (todec) != -size) { @@ -15288,14 +17666,7 @@ rs6000_emit_allocate_stack (HOST_WIDE_IN && REGNO (stack_limit_rtx) > 1 && REGNO (stack_limit_rtx) <= 31) { - emit_insn (TARGET_32BIT - ? gen_addsi3 (tmp_reg, - stack_limit_rtx, - GEN_INT (size)) - : gen_adddi3 (tmp_reg, - stack_limit_rtx, - GEN_INT (size))); - + emit_insn (gen_add3_insn (tmp_reg, stack_limit_rtx, GEN_INT (size))); emit_insn (gen_cond_trap (LTU, stack_reg, tmp_reg, const0_rtx)); } @@ -15317,35 +17688,39 @@ rs6000_emit_allocate_stack (HOST_WIDE_IN warning (0, "stack limit expression is not supported"); } - if (copy_r12 || ! TARGET_UPDATE) - emit_move_insn (gen_rtx_REG (Pmode, 12), stack_reg); - - if (TARGET_UPDATE) - { - if (size > 32767) - { - /* Need a note here so that try_split doesn't get confused. */ - if (get_last_insn () == NULL_RTX) - emit_note (NOTE_INSN_DELETED); - insn = emit_move_insn (tmp_reg, todec); - try_split (PATTERN (insn), insn, 0); - todec = tmp_reg; - } - - insn = emit_insn (TARGET_32BIT - ? gen_movsi_update (stack_reg, stack_reg, - todec, stack_reg) - : gen_movdi_di_update (stack_reg, stack_reg, - todec, stack_reg)); - } - else - { - insn = emit_insn (TARGET_32BIT - ? 
gen_addsi3 (stack_reg, stack_reg, todec) - : gen_adddi3 (stack_reg, stack_reg, todec)); - emit_move_insn (gen_rtx_MEM (Pmode, stack_reg), - gen_rtx_REG (Pmode, 12)); + if (copy_r12 || copy_r11) + emit_move_insn (copy_r11 + ? gen_rtx_REG (Pmode, 11) + : gen_rtx_REG (Pmode, 12), + stack_reg); + + if (size > 32767) + { + /* Need a note here so that try_split doesn't get confused. */ + if (get_last_insn () == NULL_RTX) + emit_note (NOTE_INSN_DELETED); + insn = emit_move_insn (tmp_reg, todec); + try_split (PATTERN (insn), insn, 0); + todec = tmp_reg; } + + insn = emit_insn (TARGET_32BIT + ? gen_movsi_update_stack (stack_reg, stack_reg, + todec, stack_reg) + : gen_movdi_di_update_stack (stack_reg, stack_reg, + todec, stack_reg)); + /* Since we didn't use gen_frame_mem to generate the MEM, grab + it now and set the alias set/attributes. The above gen_*_update + calls will generate a PARALLEL with the MEM set being the first + operation. */ + par = PATTERN (insn); + gcc_assert (GET_CODE (par) == PARALLEL); + set = XVECEXP (par, 0, 0); + gcc_assert (GET_CODE (set) == SET); + mem = SET_DEST (set); + gcc_assert (MEM_P (mem)); + MEM_NOTRAP_P (mem) = 1; + set_mem_alias_set (mem, get_frame_alias_set ()); RTX_FRAME_RELATED_P (insn) = 1; REG_NOTES (insn) = @@ -15433,77 +17808,12 @@ rs6000_frame_related (rtx insn, rtx reg, } } - if (TARGET_SPE) - real = spe_synthesize_frame_save (real); - RTX_FRAME_RELATED_P (insn) = 1; REG_NOTES (insn) = gen_rtx_EXPR_LIST (REG_FRAME_RELATED_EXPR, real, REG_NOTES (insn)); } -/* Given an SPE frame note, return a PARALLEL of SETs with the - original note, plus a synthetic register save. */ - -static rtx -spe_synthesize_frame_save (rtx real) -{ - rtx synth, offset, reg, real2; - - if (GET_CODE (real) != SET - || GET_MODE (SET_SRC (real)) != V2SImode) - return real; - - /* For the SPE, registers saved in 64-bits, get a PARALLEL for their - frame related note. 
The parallel contains a set of the register - being saved, and another set to a synthetic register (n+1200). - This is so we can differentiate between 64-bit and 32-bit saves. - Words cannot describe this nastiness. */ - - gcc_assert (GET_CODE (SET_DEST (real)) == MEM - && GET_CODE (XEXP (SET_DEST (real), 0)) == PLUS - && GET_CODE (SET_SRC (real)) == REG); - - /* Transform: - (set (mem (plus (reg x) (const y))) - (reg z)) - into: - (set (mem (plus (reg x) (const y+4))) - (reg z+1200)) - */ - - real2 = copy_rtx (real); - PUT_MODE (SET_DEST (real2), SImode); - reg = SET_SRC (real2); - real2 = replace_rtx (real2, reg, gen_rtx_REG (SImode, REGNO (reg))); - synth = copy_rtx (real2); - - if (BYTES_BIG_ENDIAN) - { - offset = XEXP (XEXP (SET_DEST (real2), 0), 1); - real2 = replace_rtx (real2, offset, GEN_INT (INTVAL (offset) + 4)); - } - - reg = SET_SRC (synth); - - synth = replace_rtx (synth, reg, - gen_rtx_REG (SImode, REGNO (reg) + 1200)); - - offset = XEXP (XEXP (SET_DEST (synth), 0), 1); - synth = replace_rtx (synth, offset, - GEN_INT (INTVAL (offset) - + (BYTES_BIG_ENDIAN ? 0 : 4))); - - RTX_FRAME_RELATED_P (synth) = 1; - RTX_FRAME_RELATED_P (real2) = 1; - if (BYTES_BIG_ENDIAN) - real = gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, synth, real2)); - else - real = gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, real2, synth)); - - return real; -} - /* Returns an insn that has a vrsave set operation with the appropriate CLOBBERs. */ @@ -15577,7 +17887,8 @@ emit_frame_save (rtx frame_reg, rtx fram /* Some cases that need register indexed addressing. 
*/ if ((TARGET_ALTIVEC_ABI && ALTIVEC_VECTOR_MODE (mode)) - || (TARGET_E500_DOUBLE && (mode == DFmode || mode == DDmode)) + || (TARGET_VSX && VSX_VECTOR_MODE (mode)) + || (TARGET_E500_DOUBLE && mode == DFmode) || (TARGET_SPE_ABI && SPE_VECTOR_MODE (mode) && !SPE_CONST_OFFSET_OK (offset))) @@ -15617,7 +17928,7 @@ gen_frame_mem_offset (enum machine_mode int_rtx = GEN_INT (offset); if ((TARGET_SPE_ABI && SPE_VECTOR_MODE (mode)) - || (TARGET_E500_DOUBLE && (mode == DFmode || mode == DDmode))) + || (TARGET_E500_DOUBLE && mode == DFmode)) { offset_rtx = gen_rtx_REG (Pmode, FIXED_SCRATCH); emit_move_insn (offset_rtx, int_rtx); @@ -15632,11 +17943,11 @@ gen_frame_mem_offset (enum machine_mode and cannot use stmw/lmw if there are any in its range. */ static bool -no_global_regs_above (int first_greg) +no_global_regs_above (int first, bool gpr) { int i; - for (i = 0; i < 32 - first_greg; i++) - if (global_regs[first_greg + i]) + for (i = first; i < (gpr ? 32 : 64); i++) + if (global_regs[i]) return false; return true; } @@ -15645,6 +17956,162 @@ no_global_regs_above (int first_greg) #define TARGET_FIX_AND_CONTINUE 0 #endif +/* It's really GPR 13 and FPR 14, but we need the smaller of the two. */ +#define FIRST_SAVRES_REGISTER FIRST_SAVED_GP_REGNO +#define LAST_SAVRES_REGISTER 31 +#define N_SAVRES_REGISTERS (LAST_SAVRES_REGISTER - FIRST_SAVRES_REGISTER + 1) + +static GTY(()) rtx savres_routine_syms[N_SAVRES_REGISTERS][8]; + +/* Return the symbol for an out-of-line register save/restore routine. + We are saving/restoring GPRs if GPR is true. */ + +static rtx +rs6000_savres_routine_sym (rs6000_stack_t *info, bool savep, bool gpr, bool exitp) +{ + int regno = gpr ? info->first_gp_reg_save : (info->first_fp_reg_save - 32); + rtx sym; + int select = ((savep ? 1 : 0) << 2 + | (gpr + /* On the SPE, we never have any FPRs, but we do have + 32/64-bit versions of the routines. */ + ? (TARGET_SPE_ABI && info->spe_64bit_regs_used ? 1 : 0) + : 0) << 1 + | (exitp ? 
1: 0)); + + /* Don't generate bogus routine names. */ + gcc_assert (FIRST_SAVRES_REGISTER <= regno && regno <= LAST_SAVRES_REGISTER); + + sym = savres_routine_syms[regno-FIRST_SAVRES_REGISTER][select]; + + if (sym == NULL) + { + char name[30]; + const char *action; + const char *regkind; + const char *exit_suffix; + + action = savep ? "save" : "rest"; + + /* SPE has slightly different names for its routines depending on + whether we are saving 32-bit or 64-bit registers. */ + if (TARGET_SPE_ABI) + { + /* No floating point saves on the SPE. */ + gcc_assert (gpr); + + regkind = info->spe_64bit_regs_used ? "64gpr" : "32gpr"; + } + else + regkind = gpr ? "gpr" : "fpr"; + + exit_suffix = exitp ? "_x" : ""; + + sprintf (name, "_%s%s_%d%s", action, regkind, regno, exit_suffix); + + sym = savres_routine_syms[regno-FIRST_SAVRES_REGISTER][select] + = gen_rtx_SYMBOL_REF (Pmode, ggc_strdup (name)); + } + + return sym; +} + +/* Emit a sequence of insns, including a stack tie if needed, for + resetting the stack pointer. If SAVRES is true, then don't reset the + stack pointer, but move the base of the frame into r11 for use by + out-of-line register restore routines. */ + +static void +rs6000_emit_stack_reset (rs6000_stack_t *info, + rtx sp_reg_rtx, rtx frame_reg_rtx, + int sp_offset, bool savres) +{ + /* This blockage is needed so that sched doesn't decide to move + the sp change before the register restores. */ + if (frame_reg_rtx != sp_reg_rtx + || (TARGET_SPE_ABI + && info->spe_64bit_regs_used != 0 + && info->first_gp_reg_save != 32)) + rs6000_emit_stack_tie (); + + if (frame_reg_rtx != sp_reg_rtx) + { + if (sp_offset != 0) + emit_insn (gen_addsi3 (sp_reg_rtx, frame_reg_rtx, + GEN_INT (sp_offset))); + else if (!savres) + emit_move_insn (sp_reg_rtx, frame_reg_rtx); + } + else if (sp_offset != 0) + { + /* If we are restoring registers out-of-line, we will be using the + "exit" variants of the restore routines, which will reset the + stack for us. 
But we do need to point r11 into the right place + for those routines. */ + rtx dest_reg = (savres + ? gen_rtx_REG (Pmode, 11) + : sp_reg_rtx); + + emit_insn (TARGET_32BIT + ? gen_addsi3 (dest_reg, sp_reg_rtx, + GEN_INT (sp_offset)) + : gen_adddi3 (dest_reg, sp_reg_rtx, + GEN_INT (sp_offset))); + } +} + +/* Construct a parallel rtx describing the effect of a call to an + out-of-line register save/restore routine. */ + +static rtx +rs6000_make_savres_rtx (rs6000_stack_t *info, + rtx frame_reg_rtx, int save_area_offset, + enum machine_mode reg_mode, + bool savep, bool gpr, bool exitp) +{ + int i; + int offset, start_reg, end_reg, n_regs; + int reg_size = GET_MODE_SIZE (reg_mode); + rtx sym; + rtvec p; + + offset = 0; + start_reg = (gpr + ? info->first_gp_reg_save + : info->first_fp_reg_save); + end_reg = gpr ? 32 : 64; + n_regs = end_reg - start_reg; + p = rtvec_alloc ((exitp ? 4 : 3) + n_regs); + + /* If we're saving registers, then we should never say we're exiting. */ + gcc_assert ((savep && !exitp) || !savep); + + if (exitp) + RTVEC_ELT (p, offset++) = gen_rtx_RETURN (VOIDmode); + + RTVEC_ELT (p, offset++) + = gen_rtx_CLOBBER (VOIDmode, gen_rtx_REG (Pmode, 65)); + + sym = rs6000_savres_routine_sym (info, savep, gpr, exitp); + RTVEC_ELT (p, offset++) = gen_rtx_USE (VOIDmode, sym); + RTVEC_ELT (p, offset++) = gen_rtx_USE (VOIDmode, gen_rtx_REG (Pmode, 11)); + + for (i = 0; i < end_reg - start_reg; i++) + { + rtx addr, reg, mem; + reg = gen_rtx_REG (reg_mode, start_reg + i); + addr = gen_rtx_PLUS (Pmode, frame_reg_rtx, + GEN_INT (save_area_offset + reg_size*i)); + mem = gen_frame_mem (reg_mode, addr); + + RTVEC_ELT (p, i + offset) = gen_rtx_SET (VOIDmode, + savep ? mem : reg, + savep ? reg : mem); + } + + return gen_rtx_PARALLEL (VOIDmode, p); +} + /* Determine whether the gp REG is really used. 
*/ static bool @@ -15659,6 +18126,85 @@ rs6000_reg_live_or_pic_offset_p (int reg || (DEFAULT_ABI == ABI_DARWIN && flag_pic)))); } +enum { + SAVRES_MULTIPLE = 0x1, + SAVRES_INLINE_FPRS = 0x2, + SAVRES_INLINE_GPRS = 0x4 +}; + +/* Determine the strategy for savings/restoring registers. */ + +static int +rs6000_savres_strategy (rs6000_stack_t *info, bool savep, + int using_static_chain_p, int sibcall) +{ + bool using_multiple_p; + bool common; + bool savres_fprs_inline; + bool savres_gprs_inline; + bool noclobber_global_gprs + = no_global_regs_above (info->first_gp_reg_save, /*gpr=*/true); + + using_multiple_p = (TARGET_MULTIPLE && ! TARGET_POWERPC64 + && (!TARGET_SPE_ABI + || info->spe_64bit_regs_used == 0) + && info->first_gp_reg_save < 31 + && noclobber_global_gprs); + /* Don't bother to try to save things out-of-line if r11 is occupied + by the static chain. It would require too much fiddling and the + static chain is rarely used anyway. */ + common = (using_static_chain_p + || sibcall + || current_function_calls_eh_return + || !info->lr_save_p + || cfun->machine->ra_need_lr + || info->total_size > 32767); + savres_fprs_inline = (common + || info->first_fp_reg_save == 64 + || !no_global_regs_above (info->first_fp_reg_save, + /*gpr=*/false) + || FP_SAVE_INLINE (info->first_fp_reg_save)); + savres_gprs_inline = (common + /* Saving CR interferes with the exit routines + used on the SPE, so just punt here. */ + || (!savep + && TARGET_SPE_ABI + && info->spe_64bit_regs_used != 0 + && info->cr_save_p != 0) + || info->first_gp_reg_save == 32 + || !noclobber_global_gprs + || GP_SAVE_INLINE (info->first_gp_reg_save)); + + if (savep) + /* If we are going to use store multiple, then don't even bother + with the out-of-line routines, since the store-multiple instruction + will always be smaller. */ + savres_gprs_inline = savres_gprs_inline || using_multiple_p; + else + { + /* The situation is more complicated with load multiple. 
We'd + prefer to use the out-of-line routines for restores, since the + "exit" out-of-line routines can handle the restore of LR and + the frame teardown. But we can only use the out-of-line + routines if we know that we've used store multiple or + out-of-line routines in the prologue, i.e. if we've saved all + the registers from first_gp_reg_save. Otherwise, we risk + loading garbage from the stack. Furthermore, we can only use + the "exit" out-of-line gpr restore if we haven't saved any + fprs. */ + bool saved_all = !savres_gprs_inline || using_multiple_p; + + if (saved_all && info->first_fp_reg_save != 64) + /* We can't use the exit routine; use load multiple if it's + available. */ + savres_gprs_inline = savres_gprs_inline || using_multiple_p; + } + + return (using_multiple_p + | (savres_fprs_inline << 1) + | (savres_gprs_inline << 2)); +} + /* Emit function prologue as insns. */ void @@ -15672,8 +18218,13 @@ rs6000_emit_prologue (void) rtx frame_reg_rtx = sp_reg_rtx; rtx cr_save_rtx = NULL_RTX; rtx insn; + int strategy; int saving_FPRs_inline; + int saving_GPRs_inline; int using_store_multiple; + int using_static_chain_p = (cfun->static_chain_decl != NULL_TREE + && df_regs_ever_live_p (STATIC_CHAIN_REGNUM) + && !call_used_regs[STATIC_CHAIN_REGNUM]); HOST_WIDE_INT sp_offset = 0; if (TARGET_FIX_AND_CONTINUE) @@ -15696,15 +18247,12 @@ rs6000_emit_prologue (void) reg_size = 8; } - using_store_multiple = (TARGET_MULTIPLE && ! 
TARGET_POWERPC64 - && (!TARGET_SPE_ABI - || info->spe_64bit_regs_used == 0) - && info->first_gp_reg_save < 31 - && no_global_regs_above (info->first_gp_reg_save)); - saving_FPRs_inline = (info->first_fp_reg_save == 64 - || FP_SAVE_INLINE (info->first_fp_reg_save) - || current_function_calls_eh_return - || cfun->machine->ra_need_lr); + strategy = rs6000_savres_strategy (info, /*savep=*/true, + /*static_chain_p=*/using_static_chain_p, + /*sibcall=*/0); + using_store_multiple = strategy & SAVRES_MULTIPLE; + saving_FPRs_inline = strategy & SAVRES_INLINE_FPRS; + saving_GPRs_inline = strategy & SAVRES_INLINE_GPRS; /* For V.4, update stack before we do any saving and set back pointer. */ if (! WORLD_SAVE_P (info) @@ -15712,17 +18260,24 @@ rs6000_emit_prologue (void) && (DEFAULT_ABI == ABI_V4 || current_function_calls_eh_return)) { + bool need_r11 = (TARGET_SPE + ? (!saving_GPRs_inline + && info->spe_64bit_regs_used == 0) + : (!saving_FPRs_inline || !saving_GPRs_inline)); if (info->total_size < 32767) sp_offset = info->total_size; else - frame_reg_rtx = frame_ptr_rtx; + frame_reg_rtx = (need_r11 + ? gen_rtx_REG (Pmode, 11) + : frame_ptr_rtx); rs6000_emit_allocate_stack (info->total_size, (frame_reg_rtx != sp_reg_rtx && (info->cr_save_p || info->lr_save_p || info->first_fp_reg_save < 64 || info->first_gp_reg_save < 32 - ))); + )), + need_r11); if (frame_reg_rtx != sp_reg_rtx) rs6000_emit_stack_tie (); } @@ -15781,11 +18336,14 @@ rs6000_emit_prologue (void) properly. */ for (i = 0; i < 64 - info->first_fp_reg_save; i++) { - rtx reg = gen_rtx_REG (DFmode, info->first_fp_reg_save + i); + rtx reg = gen_rtx_REG (((TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT) + ? DFmode : SFmode), + info->first_fp_reg_save + i); rtx addr = gen_rtx_PLUS (Pmode, frame_reg_rtx, GEN_INT (info->fp_save_offset + sp_offset + 8 * i)); - rtx mem = gen_frame_mem (DFmode, addr); + rtx mem = gen_frame_mem (((TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT) + ? 
DFmode : SFmode), addr); RTVEC_ELT (p, j++) = gen_rtx_SET (VOIDmode, mem, reg); } @@ -15892,47 +18450,154 @@ rs6000_emit_prologue (void) for (i = 0; i < 64 - info->first_fp_reg_save; i++) if ((df_regs_ever_live_p (info->first_fp_reg_save+i) && ! call_used_regs[info->first_fp_reg_save+i])) - emit_frame_save (frame_reg_rtx, frame_ptr_rtx, DFmode, + emit_frame_save (frame_reg_rtx, frame_ptr_rtx, + (TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT) + ? DFmode : SFmode, info->first_fp_reg_save + i, info->fp_save_offset + sp_offset + 8 * i, info->total_size); } else if (!WORLD_SAVE_P (info) && info->first_fp_reg_save != 64) { - int i; - char rname[30]; - const char *alloc_rname; - rtvec p; - p = rtvec_alloc (2 + 64 - info->first_fp_reg_save); - - RTVEC_ELT (p, 0) = gen_rtx_CLOBBER (VOIDmode, - gen_rtx_REG (Pmode, - LR_REGNO)); - sprintf (rname, "%s%d%s", SAVE_FP_PREFIX, - info->first_fp_reg_save - 32, SAVE_FP_SUFFIX); - alloc_rname = ggc_strdup (rname); - RTVEC_ELT (p, 1) = gen_rtx_USE (VOIDmode, - gen_rtx_SYMBOL_REF (Pmode, - alloc_rname)); - for (i = 0; i < 64 - info->first_fp_reg_save; i++) - { - rtx addr, reg, mem; - reg = gen_rtx_REG (DFmode, info->first_fp_reg_save + i); - addr = gen_rtx_PLUS (Pmode, frame_reg_rtx, - GEN_INT (info->fp_save_offset - + sp_offset + 8*i)); - mem = gen_frame_mem (DFmode, addr); + rtx par; - RTVEC_ELT (p, i + 2) = gen_rtx_SET (VOIDmode, mem, reg); - } - insn = emit_insn (gen_rtx_PARALLEL (VOIDmode, p)); + par = rs6000_make_savres_rtx (info, frame_reg_rtx, + info->fp_save_offset + sp_offset, + DFmode, + /*savep=*/true, /*gpr=*/false, + /*exitp=*/false); + insn = emit_insn (par); rs6000_frame_related (insn, frame_ptr_rtx, info->total_size, NULL_RTX, NULL_RTX); } /* Save GPRs. This is done as a PARALLEL if we are using the store-multiple instructions. 
*/ - if (!WORLD_SAVE_P (info) && using_store_multiple) + if (!WORLD_SAVE_P (info) + && TARGET_SPE_ABI + && info->spe_64bit_regs_used != 0 + && info->first_gp_reg_save != 32) + { + int i; + rtx spe_save_area_ptr; + + /* Determine whether we can address all of the registers that need + to be saved with an offset from the stack pointer that fits in + the small const field for SPE memory instructions. */ + int spe_regs_addressable_via_sp + = (SPE_CONST_OFFSET_OK(info->spe_gp_save_offset + sp_offset + + (32 - info->first_gp_reg_save - 1) * reg_size) + && saving_GPRs_inline); + int spe_offset; + + if (spe_regs_addressable_via_sp) + { + spe_save_area_ptr = frame_reg_rtx; + spe_offset = info->spe_gp_save_offset + sp_offset; + } + else + { + /* Make r11 point to the start of the SPE save area. We need + to be careful here if r11 is holding the static chain. If + it is, then temporarily save it in r0. We would use r0 as + our base register here, but using r0 as a base register in + loads and stores means something different from what we + would like. */ + int ool_adjust = (saving_GPRs_inline + ? 0 + : (info->first_gp_reg_save + - (FIRST_SAVRES_REGISTER+1))*8); + HOST_WIDE_INT offset = (info->spe_gp_save_offset + + sp_offset - ool_adjust); + + if (using_static_chain_p) + { + rtx r0 = gen_rtx_REG (Pmode, 0); + gcc_assert (info->first_gp_reg_save > 11); + + emit_move_insn (r0, gen_rtx_REG (Pmode, 11)); + } + + spe_save_area_ptr = gen_rtx_REG (Pmode, 11); + insn = emit_insn (gen_addsi3 (spe_save_area_ptr, + frame_reg_rtx, + GEN_INT (offset))); + /* We need to make sure the move to r11 gets noted for + properly outputting unwind information. 
*/ + if (!saving_GPRs_inline) + rs6000_frame_related (insn, frame_reg_rtx, offset, + NULL_RTX, NULL_RTX); + spe_offset = 0; + } + + if (saving_GPRs_inline) + { + for (i = 0; i < 32 - info->first_gp_reg_save; i++) + if (rs6000_reg_live_or_pic_offset_p (info->first_gp_reg_save + i)) + { + rtx reg = gen_rtx_REG (reg_mode, info->first_gp_reg_save + i); + rtx offset, addr, mem; + + /* We're doing all this to ensure that the offset fits into + the immediate offset of 'evstdd'. */ + gcc_assert (SPE_CONST_OFFSET_OK (reg_size * i + spe_offset)); + + offset = GEN_INT (reg_size * i + spe_offset); + addr = gen_rtx_PLUS (Pmode, spe_save_area_ptr, offset); + mem = gen_rtx_MEM (V2SImode, addr); + + insn = emit_move_insn (mem, reg); + + rs6000_frame_related (insn, spe_save_area_ptr, + info->spe_gp_save_offset + + sp_offset + reg_size * i, + offset, const0_rtx); + } + } + else + { + rtx par; + + par = rs6000_make_savres_rtx (info, gen_rtx_REG (Pmode, 11), + 0, reg_mode, + /*savep=*/true, /*gpr=*/true, + /*exitp=*/false); + insn = emit_insn (par); + rs6000_frame_related (insn, frame_ptr_rtx, info->total_size, + NULL_RTX, NULL_RTX); + } + + + /* Move the static chain pointer back. */ + if (using_static_chain_p && !spe_regs_addressable_via_sp) + emit_move_insn (gen_rtx_REG (Pmode, 11), gen_rtx_REG (Pmode, 0)); + } + else if (!WORLD_SAVE_P (info) && !saving_GPRs_inline) + { + rtx par; + + /* Need to adjust r11 if we saved any FPRs. */ + if (info->first_fp_reg_save != 64) + { + rtx r11 = gen_rtx_REG (reg_mode, 11); + rtx offset = GEN_INT (info->total_size + + (-8 * (64-info->first_fp_reg_save))); + rtx ptr_reg = (sp_reg_rtx == frame_reg_rtx + ? 
sp_reg_rtx : r11); + + emit_insn (gen_add3_insn (r11, ptr_reg, offset)); + } + + par = rs6000_make_savres_rtx (info, frame_reg_rtx, + info->gp_save_offset + sp_offset, + reg_mode, + /*savep=*/true, /*gpr=*/true, + /*exitp=*/false); + insn = emit_insn (par); + rs6000_frame_related (insn, frame_ptr_rtx, info->total_size, + NULL_RTX, NULL_RTX); + } + else if (!WORLD_SAVE_P (info) && using_store_multiple) { rtvec p; int i; @@ -15953,80 +18618,6 @@ rs6000_emit_prologue (void) rs6000_frame_related (insn, frame_ptr_rtx, info->total_size, NULL_RTX, NULL_RTX); } - else if (!WORLD_SAVE_P (info) - && TARGET_SPE_ABI - && info->spe_64bit_regs_used != 0 - && info->first_gp_reg_save != 32) - { - int i; - rtx spe_save_area_ptr; - int using_static_chain_p = (cfun->static_chain_decl != NULL_TREE - && df_regs_ever_live_p (STATIC_CHAIN_REGNUM) - && !call_used_regs[STATIC_CHAIN_REGNUM]); - - /* Determine whether we can address all of the registers that need - to be saved with an offset from the stack pointer that fits in - the small const field for SPE memory instructions. */ - int spe_regs_addressable_via_sp - = SPE_CONST_OFFSET_OK(info->spe_gp_save_offset + sp_offset - + (32 - info->first_gp_reg_save - 1) * reg_size); - int spe_offset; - - if (spe_regs_addressable_via_sp) - { - spe_save_area_ptr = frame_reg_rtx; - spe_offset = info->spe_gp_save_offset + sp_offset; - } - else - { - /* Make r11 point to the start of the SPE save area. We need - to be careful here if r11 is holding the static chain. If - it is, then temporarily save it in r0. We would use r0 as - our base register here, but using r0 as a base register in - loads and stores means something different from what we - would like. 
*/ - if (using_static_chain_p) - { - rtx r0 = gen_rtx_REG (Pmode, 0); - - gcc_assert (info->first_gp_reg_save > 11); - - emit_move_insn (r0, gen_rtx_REG (Pmode, 11)); - } - - spe_save_area_ptr = gen_rtx_REG (Pmode, 11); - emit_insn (gen_addsi3 (spe_save_area_ptr, frame_reg_rtx, - GEN_INT (info->spe_gp_save_offset + sp_offset))); - - spe_offset = 0; - } - - for (i = 0; i < 32 - info->first_gp_reg_save; i++) - if (rs6000_reg_live_or_pic_offset_p (info->first_gp_reg_save + i)) - { - rtx reg = gen_rtx_REG (reg_mode, info->first_gp_reg_save + i); - rtx offset, addr, mem; - - /* We're doing all this to ensure that the offset fits into - the immediate offset of 'evstdd'. */ - gcc_assert (SPE_CONST_OFFSET_OK (reg_size * i + spe_offset)); - - offset = GEN_INT (reg_size * i + spe_offset); - addr = gen_rtx_PLUS (Pmode, spe_save_area_ptr, offset); - mem = gen_rtx_MEM (V2SImode, addr); - - insn = emit_move_insn (mem, reg); - - rs6000_frame_related (insn, spe_save_area_ptr, - info->spe_gp_save_offset - + sp_offset + reg_size * i, - offset, const0_rtx); - } - - /* Move the static chain pointer back. */ - if (using_static_chain_p && !spe_regs_addressable_via_sp) - emit_move_insn (gen_rtx_REG (Pmode, 11), gen_rtx_REG (Pmode, 0)); - } else if (!WORLD_SAVE_P (info)) { int i; @@ -16126,7 +18717,8 @@ rs6000_emit_prologue (void) (frame_reg_rtx != sp_reg_rtx && ((info->altivec_size != 0) || (info->vrsave_mask != 0) - ))); + )), + FALSE); if (frame_reg_rtx != sp_reg_rtx) rs6000_emit_stack_tie (); } @@ -16247,7 +18839,7 @@ rs6000_emit_prologue (void) && flag_pic && current_function_uses_pic_offset_table) { rtx lr = gen_rtx_REG (Pmode, LR_REGNO); - rtx src = machopic_function_base_sym (); + rtx src = gen_rtx_SYMBOL_REF (Pmode, MACHOPIC_FUNCTION_BASE_NAME); /* Save and restore LR locally around this call (in R0). 
*/ if (!info->lr_save_p) @@ -16282,8 +18874,7 @@ rs6000_output_function_prologue (FILE *f && !FP_SAVE_INLINE (info->first_fp_reg_save)) fprintf (file, "\t.extern %s%d%s\n\t.extern %s%d%s\n", SAVE_FP_PREFIX, info->first_fp_reg_save - 32, SAVE_FP_SUFFIX, - RESTORE_FP_PREFIX, info->first_fp_reg_save - 32, - RESTORE_FP_SUFFIX); + RESTORE_FP_PREFIX, info->first_fp_reg_save - 32, RESTORE_FP_SUFFIX); /* Write .extern for AIX common mode routines, if needed. */ if (! TARGET_POWER && ! TARGET_POWERPC && ! common_mode_defined) @@ -16337,6 +18928,54 @@ rs6000_output_function_prologue (FILE *f we restore after the pop when possible. */ #define ALWAYS_RESTORE_ALTIVEC_BEFORE_POP 0 +/* Reload CR from REG. */ + +static void +rs6000_restore_saved_cr (rtx reg, int using_mfcr_multiple) +{ + int count = 0; + int i; + + if (using_mfcr_multiple) + { + for (i = 0; i < 8; i++) + if (df_regs_ever_live_p (CR0_REGNO+i) && ! call_used_regs[CR0_REGNO+i]) + count++; + gcc_assert (count); + } + + if (using_mfcr_multiple && count > 1) + { + rtvec p; + int ndx; + + p = rtvec_alloc (count); + + ndx = 0; + for (i = 0; i < 8; i++) + if (df_regs_ever_live_p (CR0_REGNO+i) && ! call_used_regs[CR0_REGNO+i]) + { + rtvec r = rtvec_alloc (2); + RTVEC_ELT (r, 0) = reg; + RTVEC_ELT (r, 1) = GEN_INT (1 << (7-i)); + RTVEC_ELT (p, ndx) = + gen_rtx_SET (VOIDmode, gen_rtx_REG (CCmode, CR0_REGNO+i), + gen_rtx_UNSPEC (CCmode, r, UNSPEC_MOVESI_TO_CR)); + ndx++; + } + emit_insn (gen_rtx_PARALLEL (VOIDmode, p)); + gcc_assert (ndx == count); + } + else + for (i = 0; i < 8; i++) + if (df_regs_ever_live_p (CR0_REGNO+i) && ! call_used_regs[CR0_REGNO+i]) + { + emit_insn (gen_movsi_to_cr_one (gen_rtx_REG (CCmode, + CR0_REGNO+i), + reg)); + } +} + /* Emit function epilogue as insns. 
At present, dwarf2out_frame_debug_expr doesn't understand @@ -16348,10 +18987,13 @@ void rs6000_emit_epilogue (int sibcall) { rs6000_stack_t *info; + int restoring_GPRs_inline; int restoring_FPRs_inline; int using_load_multiple; int using_mtcr_multiple; int use_backchain_to_restore_sp; + int restore_lr; + int strategy; int sp_offset = 0; rtx sp_reg_rtx = gen_rtx_REG (Pmode, 1); rtx frame_reg_rtx = sp_reg_rtx; @@ -16367,15 +19009,15 @@ rs6000_emit_epilogue (int sibcall) reg_size = 8; } - using_load_multiple = (TARGET_MULTIPLE && ! TARGET_POWERPC64 - && (!TARGET_SPE_ABI - || info->spe_64bit_regs_used == 0) - && info->first_gp_reg_save < 31 - && no_global_regs_above (info->first_gp_reg_save)); - restoring_FPRs_inline = (sibcall - || current_function_calls_eh_return - || info->first_fp_reg_save == 64 - || FP_SAVE_INLINE (info->first_fp_reg_save)); + strategy = rs6000_savres_strategy (info, /*savep=*/false, + /*static_chain_p=*/0, sibcall); + using_load_multiple = strategy & SAVRES_MULTIPLE; + restoring_FPRs_inline = strategy & SAVRES_INLINE_FPRS; + restoring_GPRs_inline = strategy & SAVRES_INLINE_GPRS; + using_mtcr_multiple = (rs6000_cpu == PROCESSOR_PPC601 + || rs6000_cpu == PROCESSOR_PPC603 + || rs6000_cpu == PROCESSOR_PPC750 + || optimize_size); /* Restore via the backchain when we have a large frame, since this is more efficient than an addis, addi pair. 
The second condition here will not trigger at the moment; We don't actually need a @@ -16387,10 +19029,9 @@ rs6000_emit_epilogue (int sibcall) > 32767 || (cfun->calls_alloca && !frame_pointer_needed)); - using_mtcr_multiple = (rs6000_cpu == PROCESSOR_PPC601 - || rs6000_cpu == PROCESSOR_PPC603 - || rs6000_cpu == PROCESSOR_PPC750 - || optimize_size); + restore_lr = (info->lr_save_p + && restoring_GPRs_inline + && restoring_FPRs_inline); if (WORLD_SAVE_P (info)) { @@ -16460,11 +19101,14 @@ rs6000_emit_epilogue (int sibcall) } for (i = 0; info->first_fp_reg_save + i <= 63; i++) { - rtx reg = gen_rtx_REG (DFmode, info->first_fp_reg_save + i); + rtx reg = gen_rtx_REG (((TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT) + ? DFmode : SFmode), + info->first_fp_reg_save + i); rtx addr = gen_rtx_PLUS (Pmode, frame_reg_rtx, GEN_INT (info->fp_save_offset + 8 * i)); - rtx mem = gen_frame_mem (DFmode, addr); + rtx mem = gen_frame_mem (((TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT) + ? DFmode : SFmode), addr); RTVEC_ELT (p, j++) = gen_rtx_SET (VOIDmode, reg, mem); } @@ -16657,8 +19301,9 @@ rs6000_emit_epilogue (int sibcall) emit_insn (generate_set_vrsave (reg, info, 1)); } - /* Get the old lr if we saved it. */ - if (info->lr_save_p) + /* Get the old lr if we saved it. If we are restoring registers + out-of-line, then the out-of-line routines can do this for us. */ + if (restore_lr) { rtx mem = gen_frame_mem_offset (Pmode, frame_reg_rtx, info->lr_save_offset + sp_offset); @@ -16677,7 +19322,7 @@ rs6000_emit_epilogue (int sibcall) } /* Set LR here to try to overlap restores below. */ - if (info->lr_save_p) + if (restore_lr) emit_move_insn (gen_rtx_REG (Pmode, LR_REGNO), gen_rtx_REG (Pmode, 0)); @@ -16713,35 +19358,17 @@ rs6000_emit_epilogue (int sibcall) /* Restore GPRs. This is done as a PARALLEL if we are using the load-multiple instructions. 
*/ - if (using_load_multiple) - { - rtvec p; - p = rtvec_alloc (32 - info->first_gp_reg_save); - for (i = 0; i < 32 - info->first_gp_reg_save; i++) - { - rtx addr = gen_rtx_PLUS (Pmode, frame_reg_rtx, - GEN_INT (info->gp_save_offset - + sp_offset - + reg_size * i)); - rtx mem = gen_frame_mem (reg_mode, addr); - - RTVEC_ELT (p, i) = - gen_rtx_SET (VOIDmode, - gen_rtx_REG (reg_mode, info->first_gp_reg_save + i), - mem); - } - emit_insn (gen_rtx_PARALLEL (VOIDmode, p)); - } - else if (TARGET_SPE_ABI - && info->spe_64bit_regs_used != 0 - && info->first_gp_reg_save != 32) + if (TARGET_SPE_ABI + && info->spe_64bit_regs_used != 0 + && info->first_gp_reg_save != 32) { /* Determine whether we can address all of the registers that need to be saved with an offset from the stack pointer that fits in the small const field for SPE memory instructions. */ int spe_regs_addressable_via_sp - = SPE_CONST_OFFSET_OK(info->spe_gp_save_offset + sp_offset - + (32 - info->first_gp_reg_save - 1) * reg_size); + = (SPE_CONST_OFFSET_OK(info->spe_gp_save_offset + sp_offset + + (32 - info->first_gp_reg_save - 1) * reg_size) + && restoring_GPRs_inline); int spe_offset; if (spe_regs_addressable_via_sp) @@ -16753,10 +19380,17 @@ rs6000_emit_epilogue (int sibcall) not clobbering it when we were saving registers in the prologue. There's no need to worry here because the static chain is passed anew to every function. */ + int ool_adjust = (restoring_GPRs_inline + ? 0 + : (info->first_gp_reg_save + - (FIRST_SAVRES_REGISTER+1))*8); + if (frame_reg_rtx == sp_reg_rtx) frame_reg_rtx = gen_rtx_REG (Pmode, 11); emit_insn (gen_addsi3 (frame_reg_rtx, old_frame_reg_rtx, - GEN_INT (info->spe_gp_save_offset + sp_offset))); + GEN_INT (info->spe_gp_save_offset + + sp_offset + - ool_adjust))); /* Keep the invariant that frame_reg_rtx + sp_offset points at the top of the stack frame. 
*/ sp_offset = -info->spe_gp_save_offset; @@ -16764,26 +19398,80 @@ rs6000_emit_epilogue (int sibcall) spe_offset = 0; } - for (i = 0; i < 32 - info->first_gp_reg_save; i++) - if (rs6000_reg_live_or_pic_offset_p (info->first_gp_reg_save + i)) - { - rtx offset, addr, mem; + if (restoring_GPRs_inline) + { + for (i = 0; i < 32 - info->first_gp_reg_save; i++) + if (rs6000_reg_live_or_pic_offset_p (info->first_gp_reg_save + i)) + { + rtx offset, addr, mem; - /* We're doing all this to ensure that the immediate offset - fits into the immediate field of 'evldd'. */ - gcc_assert (SPE_CONST_OFFSET_OK (spe_offset + reg_size * i)); - - offset = GEN_INT (spe_offset + reg_size * i); - addr = gen_rtx_PLUS (Pmode, frame_reg_rtx, offset); - mem = gen_rtx_MEM (V2SImode, addr); + /* We're doing all this to ensure that the immediate offset + fits into the immediate field of 'evldd'. */ + gcc_assert (SPE_CONST_OFFSET_OK (spe_offset + reg_size * i)); + + offset = GEN_INT (spe_offset + reg_size * i); + addr = gen_rtx_PLUS (Pmode, frame_reg_rtx, offset); + mem = gen_rtx_MEM (V2SImode, addr); - emit_move_insn (gen_rtx_REG (reg_mode, info->first_gp_reg_save + i), - mem); - } + emit_move_insn (gen_rtx_REG (reg_mode, info->first_gp_reg_save + i), + mem); + } + } + else + { + rtx par; + + par = rs6000_make_savres_rtx (info, gen_rtx_REG (Pmode, 11), + 0, reg_mode, + /*savep=*/false, /*gpr=*/true, + /*exitp=*/true); + emit_jump_insn (par); + + /* We don't want anybody else emitting things after we jumped + back. */ + return; + } } - else - for (i = 0; i < 32 - info->first_gp_reg_save; i++) - if (rs6000_reg_live_or_pic_offset_p (info->first_gp_reg_save + i)) + else if (!restoring_GPRs_inline) + { + /* We are jumping to an out-of-line function. */ + bool can_use_exit = info->first_fp_reg_save == 64; + rtx par; + + /* Emit stack reset code if we need it. 
*/ + if (can_use_exit) + rs6000_emit_stack_reset (info, sp_reg_rtx, frame_reg_rtx, + sp_offset, can_use_exit); + else + emit_insn (gen_addsi3 (gen_rtx_REG (Pmode, 11), + sp_reg_rtx, + GEN_INT (sp_offset - info->fp_size))); + + par = rs6000_make_savres_rtx (info, frame_reg_rtx, + info->gp_save_offset, reg_mode, + /*savep=*/false, /*gpr=*/true, + /*exitp=*/can_use_exit); + + if (can_use_exit) + { + if (info->cr_save_p) + rs6000_restore_saved_cr (gen_rtx_REG (SImode, 12), + using_mtcr_multiple); + + emit_jump_insn (par); + + /* We don't want anybody else emitting things after we jumped + back. */ + return; + } + else + emit_insn (par); + } + else if (using_load_multiple) + { + rtvec p; + p = rtvec_alloc (32 - info->first_gp_reg_save); + for (i = 0; i < 32 - info->first_gp_reg_save; i++) { rtx addr = gen_rtx_PLUS (Pmode, frame_reg_rtx, GEN_INT (info->gp_save_offset @@ -16791,9 +19479,28 @@ rs6000_emit_epilogue (int sibcall) + reg_size * i)); rtx mem = gen_frame_mem (reg_mode, addr); - emit_move_insn (gen_rtx_REG (reg_mode, - info->first_gp_reg_save + i), mem); + RTVEC_ELT (p, i) = + gen_rtx_SET (VOIDmode, + gen_rtx_REG (reg_mode, info->first_gp_reg_save + i), + mem); } + emit_insn (gen_rtx_PARALLEL (VOIDmode, p)); + } + else + { + for (i = 0; i < 32 - info->first_gp_reg_save; i++) + if (rs6000_reg_live_or_pic_offset_p (info->first_gp_reg_save + i)) + { + rtx addr = gen_rtx_PLUS (Pmode, frame_reg_rtx, + GEN_INT (info->gp_save_offset + + sp_offset + + reg_size * i)); + rtx mem = gen_frame_mem (reg_mode, addr); + + emit_move_insn (gen_rtx_REG (reg_mode, + info->first_gp_reg_save + i), mem); + } + } /* Restore fpr's if we need to do it without calling a function. */ if (restoring_FPRs_inline) @@ -16806,78 +19513,24 @@ rs6000_emit_epilogue (int sibcall) GEN_INT (info->fp_save_offset + sp_offset + 8 * i)); - mem = gen_frame_mem (DFmode, addr); + mem = gen_frame_mem (((TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT) + ? 
DFmode : SFmode), addr); - emit_move_insn (gen_rtx_REG (DFmode, + emit_move_insn (gen_rtx_REG (((TARGET_HARD_FLOAT + && TARGET_DOUBLE_FLOAT) + ? DFmode : SFmode), info->first_fp_reg_save + i), mem); } /* If we saved cr, restore it here. Just those that were used. */ if (info->cr_save_p) - { - rtx r12_rtx = gen_rtx_REG (SImode, 12); - int count = 0; - - if (using_mtcr_multiple) - { - for (i = 0; i < 8; i++) - if (df_regs_ever_live_p (CR0_REGNO+i) && ! call_used_regs[CR0_REGNO+i]) - count++; - gcc_assert (count); - } - - if (using_mtcr_multiple && count > 1) - { - rtvec p; - int ndx; - - p = rtvec_alloc (count); - - ndx = 0; - for (i = 0; i < 8; i++) - if (df_regs_ever_live_p (CR0_REGNO+i) && ! call_used_regs[CR0_REGNO+i]) - { - rtvec r = rtvec_alloc (2); - RTVEC_ELT (r, 0) = r12_rtx; - RTVEC_ELT (r, 1) = GEN_INT (1 << (7-i)); - RTVEC_ELT (p, ndx) = - gen_rtx_SET (VOIDmode, gen_rtx_REG (CCmode, CR0_REGNO+i), - gen_rtx_UNSPEC (CCmode, r, UNSPEC_MOVESI_TO_CR)); - ndx++; - } - emit_insn (gen_rtx_PARALLEL (VOIDmode, p)); - gcc_assert (ndx == count); - } - else - for (i = 0; i < 8; i++) - if (df_regs_ever_live_p (CR0_REGNO+i) && ! call_used_regs[CR0_REGNO+i]) - { - emit_insn (gen_movsi_to_cr_one (gen_rtx_REG (CCmode, - CR0_REGNO+i), - r12_rtx)); - } - } + rs6000_restore_saved_cr (gen_rtx_REG (SImode, 12), using_mtcr_multiple); /* If this is V.4, unwind the stack pointer after all of the loads have been done. */ - if (frame_reg_rtx != sp_reg_rtx) - { - /* This blockage is needed so that sched doesn't decide to move - the sp change before the register restores. */ - rs6000_emit_stack_tie (); - if (sp_offset != 0) - emit_insn (gen_addsi3 (sp_reg_rtx, frame_reg_rtx, - GEN_INT (sp_offset))); - else - emit_move_insn (sp_reg_rtx, frame_reg_rtx); - } - else if (sp_offset != 0) - emit_insn (TARGET_32BIT - ? 
gen_addsi3 (sp_reg_rtx, sp_reg_rtx, - GEN_INT (sp_offset)) - : gen_adddi3 (sp_reg_rtx, sp_reg_rtx, - GEN_INT (sp_offset))); + rs6000_emit_stack_reset (info, sp_reg_rtx, frame_reg_rtx, + sp_offset, !restoring_FPRs_inline); if (current_function_calls_eh_return) { @@ -16891,30 +19544,30 @@ rs6000_emit_epilogue (int sibcall) { rtvec p; if (! restoring_FPRs_inline) - p = rtvec_alloc (3 + 64 - info->first_fp_reg_save); + p = rtvec_alloc (4 + 64 - info->first_fp_reg_save); else p = rtvec_alloc (2); RTVEC_ELT (p, 0) = gen_rtx_RETURN (VOIDmode); - RTVEC_ELT (p, 1) = gen_rtx_USE (VOIDmode, - gen_rtx_REG (Pmode, - LR_REGNO)); + RTVEC_ELT (p, 1) = (restoring_FPRs_inline + ? gen_rtx_USE (VOIDmode, gen_rtx_REG (Pmode, 65)) + : gen_rtx_CLOBBER (VOIDmode, + gen_rtx_REG (Pmode, 65))); /* If we have to restore more than two FP registers, branch to the restore function. It will return to our caller. */ if (! restoring_FPRs_inline) { int i; - char rname[30]; - const char *alloc_rname; - - sprintf (rname, "%s%d%s", RESTORE_FP_PREFIX, - info->first_fp_reg_save - 32, RESTORE_FP_SUFFIX); - alloc_rname = ggc_strdup (rname); - RTVEC_ELT (p, 2) = gen_rtx_USE (VOIDmode, - gen_rtx_SYMBOL_REF (Pmode, - alloc_rname)); + rtx sym; + sym = rs6000_savres_routine_sym (info, + /*savep=*/false, + /*gpr=*/false, + /*exitp=*/true); + RTVEC_ELT (p, 2) = gen_rtx_USE (VOIDmode, sym); + RTVEC_ELT (p, 3) = gen_rtx_USE (VOIDmode, + gen_rtx_REG (Pmode, 11)); for (i = 0; i < 64 - info->first_fp_reg_save; i++) { rtx addr, mem; @@ -16922,7 +19575,7 @@ rs6000_emit_epilogue (int sibcall) GEN_INT (info->fp_save_offset + 8*i)); mem = gen_frame_mem (DFmode, addr); - RTVEC_ELT (p, i+3) = + RTVEC_ELT (p, i+4) = gen_rtx_SET (VOIDmode, gen_rtx_REG (DFmode, info->first_fp_reg_save + i), mem); @@ -17009,7 +19662,7 @@ rs6000_output_function_epilogue (FILE *f System V.4 Powerpc's (and the embedded ABI derived from it) use a different traceback table. */ if (DEFAULT_ABI == ABI_AIX && ! 
flag_inhibit_size_directive - && rs6000_traceback != traceback_none && !current_function_is_thunk) + && rs6000_traceback != traceback_none && !cfun->is_thunk) { const char *fname = NULL; const char *language_string = lang_hooks.name; @@ -17267,7 +19920,7 @@ rs6000_output_mi_thunk (FILE *file, tree HOST_WIDE_INT delta, HOST_WIDE_INT vcall_offset, tree function) { - rtx this, insn, funexp; + rtx this_rtx, insn, funexp; reload_completed = 1; epilogue_completed = 1; @@ -17278,18 +19931,13 @@ rs6000_output_mi_thunk (FILE *file, tree /* Find the "this" pointer. If the function returns a structure, the structure return pointer is in r3. */ if (aggregate_value_p (TREE_TYPE (TREE_TYPE (function)), function)) - this = gen_rtx_REG (Pmode, 4); + this_rtx = gen_rtx_REG (Pmode, 4); else - this = gen_rtx_REG (Pmode, 3); + this_rtx = gen_rtx_REG (Pmode, 3); /* Apply the constant offset, if required. */ if (delta) - { - rtx delta_rtx = GEN_INT (delta); - emit_insn (TARGET_32BIT - ? gen_addsi3 (this, this, delta_rtx) - : gen_adddi3 (this, this, delta_rtx)); - } + emit_insn (gen_add3_insn (this_rtx, this_rtx, GEN_INT (delta))); /* Apply the offset from the vtable, if required. */ if (vcall_offset) @@ -17297,12 +19945,10 @@ rs6000_output_mi_thunk (FILE *file, tree rtx vcall_offset_rtx = GEN_INT (vcall_offset); rtx tmp = gen_rtx_REG (Pmode, 12); - emit_move_insn (tmp, gen_rtx_MEM (Pmode, this)); + emit_move_insn (tmp, gen_rtx_MEM (Pmode, this_rtx)); if (((unsigned HOST_WIDE_INT) vcall_offset) + 0x8000 >= 0x10000) { - emit_insn (TARGET_32BIT - ? gen_addsi3 (tmp, tmp, vcall_offset_rtx) - : gen_adddi3 (tmp, tmp, vcall_offset_rtx)); + emit_insn (gen_add3_insn (tmp, tmp, vcall_offset_rtx)); emit_move_insn (tmp, gen_rtx_MEM (Pmode, tmp)); } else @@ -17311,9 +19957,7 @@ rs6000_output_mi_thunk (FILE *file, tree emit_move_insn (tmp, gen_rtx_MEM (Pmode, loc)); } - emit_insn (TARGET_32BIT - ? 
gen_addsi3 (this, this, tmp) - : gen_adddi3 (this, this, tmp)); + emit_insn (gen_add3_insn (this_rtx, this_rtx, tmp)); } /* Generate a tail call to the target function. */ @@ -17355,6 +19999,7 @@ rs6000_output_mi_thunk (FILE *file, tree final_start_function (insn, file, 1); final (insn, file, 1); final_end_function (); + free_after_compilation (cfun); reload_completed = 0; epilogue_completed = 0; @@ -17497,6 +20142,35 @@ toc_hash_eq (const void *h1, const void || strncmp ("_ZTI", name, strlen ("_ZTI")) == 0 \ || strncmp ("_ZTC", name, strlen ("_ZTC")) == 0) +#ifdef NO_DOLLAR_IN_LABEL +/* Return a GGC-allocated character string translating dollar signs in + input NAME to underscores. Used by XCOFF ASM_OUTPUT_LABELREF. */ + +const char * +rs6000_xcoff_strip_dollar (const char *name) +{ + char *strip, *p; + int len; + + p = strchr (name, '$'); + + if (p == 0 || p == name) + return name; + + len = strlen (name); + strip = (char *) alloca (len + 1); + strcpy (strip, name); + p = strchr (strip, '$'); + while (p) + { + *p = '_'; + p = strchr (p + 1, '$'); + } + + return ggc_alloc_string (strip, len); +} +#endif + void rs6000_output_symbol_ref (FILE *file, rtx x) { @@ -17524,7 +20198,6 @@ output_toc (FILE *file, rtx x, int label { char buf[256]; const char *name = buf; - const char *real_name; rtx base = x; HOST_WIDE_INT offset = 0; @@ -17545,12 +20218,12 @@ output_toc (FILE *file, rtx x, int label toc_hash_table = htab_create_ggc (1021, toc_hash_function, toc_hash_eq, NULL); - h = ggc_alloc (sizeof (*h)); + h = GGC_NEW (struct toc_hash_struct); h->key = x; h->key_mode = mode; h->labelno = labelno; - found = htab_find_slot (toc_hash_table, h, 1); + found = htab_find_slot (toc_hash_table, h, INSERT); if (*found == NULL) *found = h; else /* This is indeed a duplicate. 
@@ -17773,7 +20446,8 @@ output_toc (FILE *file, rtx x, int label if (GET_CODE (x) == CONST) { - gcc_assert (GET_CODE (XEXP (x, 0)) == PLUS); + gcc_assert (GET_CODE (XEXP (x, 0)) == PLUS + && GET_CODE (XEXP (XEXP (x, 0), 1)) == CONST_INT); base = XEXP (XEXP (x, 0), 0); offset = INTVAL (XEXP (XEXP (x, 0), 1)); @@ -17798,12 +20472,12 @@ output_toc (FILE *file, rtx x, int label gcc_unreachable (); } - real_name = (*targetm.strip_name_encoding) (name); if (TARGET_MINIMAL_TOC) fputs (TARGET_32BIT ? "\t.long " : DOUBLE_INT_ASM_OP, file); else { - fprintf (file, "\t.tc %s", real_name); + fputs ("\t.tc ", file); + RS6000_OUTPUT_BASENAME (file, name); if (offset < 0) fprintf (file, ".N" HOST_WIDE_INT_PRINT_UNSIGNED, - offset); @@ -17971,7 +20645,8 @@ output_profile_hook (int labelno ATTRIBU # define NO_PROFILE_COUNTERS 0 #endif if (NO_PROFILE_COUNTERS) - emit_library_call (init_one_libfunc (RS6000_MCOUNT), 0, VOIDmode, 0); + emit_library_call (init_one_libfunc (RS6000_MCOUNT), + LCT_NORMAL, VOIDmode, 0); else { char buf[30]; @@ -17982,8 +20657,8 @@ output_profile_hook (int labelno ATTRIBU label_name = (*targetm.strip_name_encoding) (ggc_strdup (buf)); fun = gen_rtx_SYMBOL_REF (Pmode, label_name); - emit_library_call (init_one_libfunc (RS6000_MCOUNT), 0, VOIDmode, 1, - fun, Pmode); + emit_library_call (init_one_libfunc (RS6000_MCOUNT), + LCT_NORMAL, VOIDmode, 1, fun, Pmode); } } else if (DEFAULT_ABI == ABI_DARWIN) @@ -17998,11 +20673,11 @@ output_profile_hook (int labelno ATTRIBU /* For PIC code, set up a stub and collect the caller's address from r0, which is where the prologue puts it. 
*/ if (MACHOPIC_INDIRECT - && current_function_uses_pic_offset_table) + && crtl->uses_pic_offset_table) caller_addr_regno = 0; #endif emit_library_call (gen_rtx_SYMBOL_REF (Pmode, mcount_name), - 0, VOIDmode, 1, + LCT_NORMAL, VOIDmode, 1, gen_rtx_REG (Pmode, caller_addr_regno), Pmode); } } @@ -18224,6 +20899,7 @@ rs6000_adjust_cost (rtx insn, rtx link, || rs6000_cpu_attr == CPU_PPC7450 || rs6000_cpu_attr == CPU_POWER4 || rs6000_cpu_attr == CPU_POWER5 + || rs6000_cpu_attr == CPU_POWER7 || rs6000_cpu_attr == CPU_CELL) && recog_memoized (dep_insn) && (INSN_CODE (dep_insn) >= 0)) @@ -18238,7 +20914,7 @@ rs6000_adjust_cost (rtx insn, rtx link, case TYPE_FPCOMPARE: case TYPE_CR_LOGICAL: case TYPE_DELAYED_CR: - return cost + 2; + return cost + 2; default: break; } @@ -18283,7 +20959,7 @@ rs6000_adjust_cost (rtx insn, rtx link, if (! store_data_bypass_p (dep_insn, insn)) return 6; break; - } + } case TYPE_INTEGER: case TYPE_COMPARE: case TYPE_FAST_COMPARE: @@ -18329,7 +21005,7 @@ rs6000_adjust_cost (rtx insn, rtx link, break; } } - break; + break; case TYPE_LOAD: case TYPE_LOAD_U: @@ -18424,7 +21100,7 @@ rs6000_adjust_cost (rtx insn, rtx link, break; } - /* Fall out to return default cost. */ + /* Fall out to return default cost. */ } break; @@ -18463,6 +21139,35 @@ rs6000_adjust_cost (rtx insn, rtx link, return cost; } +/* Debug version of rs6000_adjust_cost. 
*/ + +static int +rs6000_debug_adjust_cost (rtx insn, rtx link, rtx dep_insn, int cost) +{ + int ret = rs6000_adjust_cost (insn, link, dep_insn, cost); + + if (ret != cost) + { + const char *dep; + + switch (REG_NOTE_KIND (link)) + { + default: dep = "unknown depencency"; break; + case REG_DEP_TRUE: dep = "data dependency"; break; + case REG_DEP_OUTPUT: dep = "output dependency"; break; + case REG_DEP_ANTI: dep = "anti depencency"; break; + } + + fprintf (stderr, + "\nrs6000_adjust_cost, final cost = %d, orig cost = %d, " + "%s, insn:\n", ret, cost, dep); + + debug_rtx (insn); + } + + return ret; +} + /* The function returns a true if INSN is microcoded. Return false otherwise. */ @@ -18731,6 +21436,9 @@ rs6000_issue_rate (void) case CPU_PPC7400: case CPU_PPC8540: case CPU_CELL: + case CPU_PPCE300C2: + case CPU_PPCE300C3: + case CPU_PPCE500MC: return 2; case CPU_RIOS2: case CPU_PPC604: @@ -18741,6 +21449,7 @@ rs6000_issue_rate (void) case CPU_POWER4: case CPU_POWER5: case CPU_POWER6: + case CPU_POWER7: return 5; default: return 1; @@ -19117,7 +21826,7 @@ rs6000_sched_reorder2 (FILE *dump, int s while (pos >= 0) { if (is_load_insn (ready[pos]) - && INSN_PRIORITY_KNOWN (ready[pos])) + && INSN_PRIORITY_KNOWN (ready[pos])) { INSN_PRIORITY (ready[pos])++; @@ -19161,6 +21870,7 @@ rs6000_sched_reorder2 (FILE *dump, int s ready[*pn_ready-1] = tmp; if INSN_PRIORITY_KNOWN (tmp) INSN_PRIORITY (tmp)++; + first_store_pos = -1; break; @@ -19193,7 +21903,7 @@ rs6000_sched_reorder2 (FILE *dump, int s while (pos >= 0) { if (is_store_insn (ready[pos]) - && INSN_PRIORITY_KNOWN (ready[pos])) + && INSN_PRIORITY_KNOWN (ready[pos])) { INSN_PRIORITY (ready[pos])++; @@ -19337,6 +22047,41 @@ insn_must_be_first_in_group (rtx insn) break; } break; + case PROCESSOR_POWER7: + type = get_attr_type (insn); + + switch (type) + { + case TYPE_CR_LOGICAL: + case TYPE_MFCR: + case TYPE_MFCRF: + case TYPE_MTCR: + case TYPE_IDIV: + case TYPE_LDIV: + case TYPE_COMPARE: + case TYPE_DELAYED_COMPARE: + case 
TYPE_VAR_DELAYED_COMPARE: + case TYPE_ISYNC: + case TYPE_LOAD_L: + case TYPE_STORE_C: + case TYPE_LOAD_U: + case TYPE_LOAD_UX: + case TYPE_LOAD_EXT: + case TYPE_LOAD_EXT_U: + case TYPE_LOAD_EXT_UX: + case TYPE_STORE_U: + case TYPE_STORE_UX: + case TYPE_FPLOAD_U: + case TYPE_FPLOAD_UX: + case TYPE_FPSTORE_U: + case TYPE_FPSTORE_UX: + case TYPE_MFJMPR: + case TYPE_MTJMPR: + return true; + default: + break; + } + break; default: break; } @@ -19398,6 +22143,23 @@ insn_must_be_last_in_group (rtx insn) break; } break; + case PROCESSOR_POWER7: + type = get_attr_type (insn); + + switch (type) + { + case TYPE_ISYNC: + case TYPE_SYNC: + case TYPE_LOAD_L: + case TYPE_STORE_C: + case TYPE_LOAD_EXT_U: + case TYPE_LOAD_EXT_UX: + case TYPE_STORE_UX: + return true; + default: + break; + } + break; default: break; } @@ -19601,7 +22363,7 @@ redefine_groups (FILE *dump, int sched_v /* Initialize. */ issue_rate = rs6000_issue_rate (); - group_insns = alloca (issue_rate * sizeof (rtx)); + group_insns = XALLOCAVEC (rtx, issue_rate); for (i = 0; i < issue_rate; i++) { group_insns[i] = 0; @@ -19695,7 +22457,7 @@ pad_groups (FILE *dump, int sched_verbos if (group_end) { /* If the scheduler had marked group termination at this location - (between insn and next_indn), and neither insn nor next_insn will + (between insn and next_insn), and neither insn nor next_insn will force group termination, pad the group with nops to force group termination. */ if (can_issue_more @@ -19769,6 +22531,7 @@ rs6000_sched_finish (FILE *dump, int sch } } } + /* Length in units of the trampoline for entering a nested function. 
*/ @@ -19832,7 +22595,7 @@ rs6000_initialize_trampoline (rtx addr, case ABI_DARWIN: case ABI_V4: emit_library_call (gen_rtx_SYMBOL_REF (Pmode, "__trampoline_setup"), - FALSE, VOIDmode, 4, + LCT_NORMAL, VOIDmode, 4, addr, Pmode, GEN_INT (rs6000_trampoline_size ()), SImode, fnaddr, Pmode, @@ -19895,19 +22658,7 @@ rs6000_handle_altivec_attribute (tree *n mode = TYPE_MODE (type); /* Check for invalid AltiVec type qualifiers. */ - if (type == long_unsigned_type_node || type == long_integer_type_node) - { - if (TARGET_64BIT) - error ("use of %<long%> in AltiVec types is invalid for 64-bit code"); - else if (rs6000_warn_altivec_long) - warning (0, "use of %<long%> in AltiVec types is deprecated; use %<int%>"); - } - else if (type == long_long_unsigned_type_node - || type == long_long_integer_type_node) - error ("use of %<long long%> in AltiVec types is invalid"); - else if (type == double_type_node) - error ("use of %<double%> in AltiVec types is invalid"); - else if (type == long_double_type_node) + if (type == long_double_type_node) error ("use of %<long double%> in AltiVec types is invalid"); else if (type == boolean_type_node) error ("use of boolean types in AltiVec types is invalid"); @@ -19915,6 +22666,24 @@ rs6000_handle_altivec_attribute (tree *n error ("use of %<complex%> in AltiVec types is invalid"); else if (DECIMAL_FLOAT_MODE_P (mode)) error ("use of decimal floating point types in AltiVec types is invalid"); + else if (!TARGET_VSX) + { + if (type == long_unsigned_type_node || type == long_integer_type_node) + { + if (TARGET_64BIT) + error ("use of %<long%> in AltiVec types is invalid for " + "64-bit code without -mvsx"); + else if (rs6000_warn_altivec_long) + warning (0, "use of %<long%> in AltiVec types is deprecated; " + "use %<int%>"); + } + else if (type == long_long_unsigned_type_node + || type == long_long_integer_type_node) + error ("use of %<long long%> in AltiVec types is invalid without " + "-mvsx"); + else if (type == double_type_node) + error 
("use of %<double%> in AltiVec types is invalid without -mvsx"); + } switch (altivec_type) { @@ -19922,6 +22691,9 @@ rs6000_handle_altivec_attribute (tree *n unsigned_p = TYPE_UNSIGNED (type); switch (mode) { + case DImode: + result = (unsigned_p ? unsigned_V2DI_type_node : V2DI_type_node); + break; case SImode: result = (unsigned_p ? unsigned_V4SI_type_node : V4SI_type_node); break; @@ -19932,10 +22704,12 @@ rs6000_handle_altivec_attribute (tree *n result = (unsigned_p ? unsigned_V16QI_type_node : V16QI_type_node); break; case SFmode: result = V4SF_type_node; break; + case DFmode: result = V2DF_type_node; break; /* If the user says 'vector int bool', we may be handed the 'bool' attribute _before_ the 'vector' attribute, and so select the proper type in the 'b' case below. */ case V4SImode: case V8HImode: case V16QImode: case V4SFmode: + case V2DImode: case V2DFmode: result = type; default: break; } @@ -19943,6 +22717,7 @@ rs6000_handle_altivec_attribute (tree *n case 'b': switch (mode) { + case DImode: case V2DImode: result = bool_V2DI_type_node; break; case SImode: case V4SImode: result = bool_V4SI_type_node; break; case HImode: case V8HImode: result = bool_V8HI_type_node; break; case QImode: case V16QImode: result = bool_V16QI_type_node; @@ -19987,6 +22762,7 @@ rs6000_mangle_type (const_tree type) if (type == bool_short_type_node) return "U6__bools"; if (type == pixel_type_node) return "u7__pixel"; if (type == bool_int_type_node) return "U6__booli"; + if (type == bool_long_type_node) return "U6__booll"; /* Mangle IBM extended float long double as `g' (__float128) on powerpc*-linux where long-double-64 previously was the default. 
*/ @@ -20199,7 +22975,7 @@ rs6000_elf_encode_section_info (tree dec { rtx sym_ref = XEXP (rtl, 0); size_t len = strlen (XSTR (sym_ref, 0)); - char *str = alloca (len + 2); + char *str = XALLOCAVEC (char, len + 2); str[0] = '.'; memcpy (str + 1, XSTR (sym_ref, 0), len + 1); XSTR (sym_ref, 0) = ggc_alloc_string (str, len + 1); @@ -20207,12 +22983,12 @@ rs6000_elf_encode_section_info (tree dec } static inline bool -compare_section_name (const char *section, const char *template) +compare_section_name (const char *section, const char *templ) { int len; - len = strlen (template); - return (strncmp (section, template, len) == 0 + len = strlen (templ); + return (strncmp (section, templ, len) == 0 && (section[len] == 0 || section[len] == '.')); } @@ -20498,10 +23274,10 @@ machopic_output_stub (FILE *file, const length = strlen (symb); - symbol_name = alloca (length + 32); + symbol_name = XALLOCAVEC (char, length + 32); GEN_SYMBOL_NAME_FOR_SYMBOL (symbol_name, symb, length); - lazy_ptr_name = alloca (length + 32); + lazy_ptr_name = XALLOCAVEC (char, length + 32); GEN_LAZY_PTR_NAME_FOR_SYMBOL (lazy_ptr_name, symb, length); if (flag_pic == 2) @@ -20517,7 +23293,7 @@ machopic_output_stub (FILE *file, const fprintf (file, "\t.indirect_symbol %s\n", symbol_name); label++; - local_label_0 = alloca (sizeof ("\"L00000000000$spb\"")); + local_label_0 = XALLOCAVEC (char, sizeof ("\"L00000000000$spb\"")); sprintf (local_label_0, "\"L%011d$spb\"", label); fprintf (file, "\tmflr r0\n"); @@ -21390,7 +24166,7 @@ rs6000_rtx_costs (rtx x, int code, int o case CALL: case IF_THEN_ELSE: - if (optimize_size) + if (!optimize_size) { *total = COSTS_N_INSNS (1); return true; @@ -21452,6 +24228,40 @@ rs6000_rtx_costs (rtx x, int code, int o return false; } +/* Debug form of r6000_rtx_costs that is selected if -mdebug=cost. 
*/ + +static bool +rs6000_debug_rtx_costs (rtx x, int code, int outer_code, int *total) +{ + bool ret = rs6000_rtx_costs (x, code, outer_code, total); + + fprintf (stderr, + "\nrs6000_rtx_costs, return = %s, code = %s, outer_code = %s, " + "total = %d, x:\n", + ret ? "complete" : "scan inner", + GET_RTX_NAME (code), + GET_RTX_NAME (outer_code), + *total); + + debug_rtx (x); + + return ret; +} + +/* Debug form of ADDRESS_COST that is selected if -mdebug=cost. */ + +static int +rs6000_debug_address_cost (rtx x) +{ + int ret = TARGET_ADDRESS_COST (x); + + fprintf (stderr, "\nrs6000_address_cost, return = %d, x:\n", ret); + debug_rtx (x); + + return ret; +} + + /* A C expression returning the cost of moving data from a register of class CLASS1 to one of CLASS2. */ @@ -21459,6 +24269,8 @@ int rs6000_register_move_cost (enum machine_mode mode, enum reg_class from, enum reg_class to) { + int ret; + /* Moves from/to GENERAL_REGS. */ if (reg_classes_intersect_p (to, GENERAL_REGS) || reg_classes_intersect_p (from, GENERAL_REGS)) @@ -21466,51 +24278,74 @@ rs6000_register_move_cost (enum machine_ if (! reg_classes_intersect_p (to, GENERAL_REGS)) from = to; - if (from == FLOAT_REGS || from == ALTIVEC_REGS) - return (rs6000_memory_move_cost (mode, from, 0) - + rs6000_memory_move_cost (mode, GENERAL_REGS, 0)); + if (from == FLOAT_REGS || from == ALTIVEC_REGS || from == VSX_REGS) + ret = (rs6000_memory_move_cost (mode, from, 0) + + rs6000_memory_move_cost (mode, GENERAL_REGS, 0)); /* It's more expensive to move CR_REGS than CR0_REGS because of the shift. */ else if (from == CR_REGS) - return 4; + ret = 4; /* Power6 has slower LR/CTR moves so make them more expensive than memory in order to bias spills to memory .*/ else if (rs6000_cpu == PROCESSOR_POWER6 && reg_classes_intersect_p (from, LINK_OR_CTR_REGS)) - return 6 * hard_regno_nregs[0][mode]; + ret = 6 * hard_regno_nregs[0][mode]; else /* A move will cost one instruction per GPR moved. 
*/ - return 2 * hard_regno_nregs[0][mode]; + ret = 2 * hard_regno_nregs[0][mode]; } + /* If we have VSX, we can easily move between FPR or Altivec registers. */ + else if (VECTOR_UNIT_VSX_P (mode) + && reg_classes_intersect_p (to, VSX_REGS) + && reg_classes_intersect_p (from, VSX_REGS)) + ret = 2 * hard_regno_nregs[32][mode]; + /* Moving between two similar registers is just one instruction. */ else if (reg_classes_intersect_p (to, from)) - return (mode == TFmode || mode == TDmode) ? 4 : 2; + ret = (mode == TFmode || mode == TDmode) ? 4 : 2; /* Everything else has to go through GENERAL_REGS. */ else - return (rs6000_register_move_cost (mode, GENERAL_REGS, to) - + rs6000_register_move_cost (mode, from, GENERAL_REGS)); + ret = (rs6000_register_move_cost (mode, GENERAL_REGS, to) + + rs6000_register_move_cost (mode, from, GENERAL_REGS)); + + if (TARGET_DEBUG_COST) + fprintf (stderr, + "rs6000_register_move_cost:, ret=%d, mode=%s, from=%s, to=%s\n", + ret, GET_MODE_NAME (mode), reg_class_names[from], + reg_class_names[to]); + + return ret; } /* A C expressions returning the cost of moving data of MODE from a register to or from memory. 
*/ int -rs6000_memory_move_cost (enum machine_mode mode, enum reg_class class, +rs6000_memory_move_cost (enum machine_mode mode, enum reg_class rclass, int in ATTRIBUTE_UNUSED) { - if (reg_classes_intersect_p (class, GENERAL_REGS)) - return 4 * hard_regno_nregs[0][mode]; - else if (reg_classes_intersect_p (class, FLOAT_REGS)) - return 4 * hard_regno_nregs[32][mode]; - else if (reg_classes_intersect_p (class, ALTIVEC_REGS)) - return 4 * hard_regno_nregs[FIRST_ALTIVEC_REGNO][mode]; - else - return 4 + rs6000_register_move_cost (mode, class, GENERAL_REGS); + int ret; + + if (reg_classes_intersect_p (rclass, GENERAL_REGS)) + ret = 4 * hard_regno_nregs[0][mode]; + else if (reg_classes_intersect_p (rclass, FLOAT_REGS)) + ret = 4 * hard_regno_nregs[32][mode]; + else if (reg_classes_intersect_p (rclass, ALTIVEC_REGS)) + ret = 4 * hard_regno_nregs[FIRST_ALTIVEC_REGNO][mode]; + else + ret = 4 + rs6000_register_move_cost (mode, rclass, GENERAL_REGS); + + if (TARGET_DEBUG_COST) + fprintf (stderr, + "rs6000_memory_move_cost: ret=%d, mode=%s, rclass=%s, in=%d\n", + ret, GET_MODE_NAME (mode), reg_class_names[rclass], in); + + return ret; } /* Returns a code for a target-specific builtin that implements @@ -21726,8 +24561,8 @@ rs6000_emit_swrsqrtsf (rtx dst, rtx src) emit_label (XEXP (label, 0)); } -/* Emit popcount intrinsic on TARGET_POPCNTB targets. DST is the - target, and SRC is the argument operand. */ +/* Emit popcount intrinsic on TARGET_POPCNTB (Power5) and TARGET_POPCNTD + (Power7) targets. DST is the target, and SRC is the argument operand. */ void rs6000_emit_popcount (rtx dst, rtx src) @@ -21735,6 +24570,16 @@ rs6000_emit_popcount (rtx dst, rtx src) enum machine_mode mode = GET_MODE (dst); rtx tmp1, tmp2; + /* Use the PPC ISA 2.06 popcnt{w,d} instruction if we can. 
*/ + if (TARGET_POPCNTD) + { + if (mode == SImode) + emit_insn (gen_popcntwsi2 (dst, src)); + else + emit_insn (gen_popcntddi2 (dst, src)); + return; + } + tmp1 = gen_reg_rtx (mode); if (mode == SImode) @@ -21930,7 +24775,8 @@ rs6000_function_value (const_tree valtyp if (DECIMAL_FLOAT_MODE_P (mode) && TARGET_HARD_FLOAT && TARGET_FPRS) /* _Decimal128 must use an even/odd register pair. */ regno = (mode == TDmode) ? FP_ARG_RETURN + 1 : FP_ARG_RETURN; - else if (SCALAR_FLOAT_TYPE_P (valtype) && TARGET_HARD_FLOAT && TARGET_FPRS) + else if (SCALAR_FLOAT_TYPE_P (valtype) && TARGET_HARD_FLOAT && TARGET_FPRS + && ((TARGET_SINGLE_FLOAT && (mode == SFmode)) || TARGET_DOUBLE_FLOAT)) regno = FP_ARG_RETURN; else if (TREE_CODE (valtype) == COMPLEX_TYPE && targetm.calls.split_complex_arg) @@ -21939,9 +24785,13 @@ rs6000_function_value (const_tree valtyp && TARGET_ALTIVEC && TARGET_ALTIVEC_ABI && ALTIVEC_VECTOR_MODE (mode)) regno = ALTIVEC_ARG_RETURN; + else if (TREE_CODE (valtype) == VECTOR_TYPE + && TARGET_VSX && TARGET_ALTIVEC_ABI + && VSX_VECTOR_MODE (mode)) + regno = ALTIVEC_ARG_RETURN; else if (TARGET_E500_DOUBLE && TARGET_HARD_FLOAT - && (mode == DFmode || mode == DDmode || mode == DCmode - || mode == TFmode || mode == TDmode || mode == TCmode)) + && (mode == DFmode || mode == DCmode + || mode == TFmode || mode == TCmode)) return spe_build_register_parallel (mode, GP_ARG_RETURN); else regno = GP_ARG_RETURN; @@ -21974,16 +24824,20 @@ rs6000_libcall_value (enum machine_mode /* _Decimal128 must use an even/odd register pair. */ regno = (mode == TDmode) ? 
FP_ARG_RETURN + 1 : FP_ARG_RETURN; else if (SCALAR_FLOAT_MODE_P (mode) - && TARGET_HARD_FLOAT && TARGET_FPRS) + && TARGET_HARD_FLOAT && TARGET_FPRS + && ((TARGET_SINGLE_FLOAT && mode == SFmode) || TARGET_DOUBLE_FLOAT)) regno = FP_ARG_RETURN; else if (ALTIVEC_VECTOR_MODE (mode) && TARGET_ALTIVEC && TARGET_ALTIVEC_ABI) regno = ALTIVEC_ARG_RETURN; + else if (VSX_VECTOR_MODE (mode) + && TARGET_VSX && TARGET_ALTIVEC_ABI) + regno = ALTIVEC_ARG_RETURN; else if (COMPLEX_MODE_P (mode) && targetm.calls.split_complex_arg) return rs6000_complex_function_value (mode); else if (TARGET_E500_DOUBLE && TARGET_HARD_FLOAT - && (mode == DFmode || mode == DDmode || mode == DCmode - || mode == TFmode || mode == TDmode || mode == TCmode)) + && (mode == DFmode || mode == DCmode + || mode == TFmode || mode == TCmode)) return spe_build_register_parallel (mode, GP_ARG_RETURN); else regno = GP_ARG_RETURN; @@ -22030,7 +24884,6 @@ rs6000_is_opaque_type (const_tree type) { return (type == opaque_V2SI_type_node || type == opaque_V2SF_type_node - || type == opaque_p_V2SI_type_node || type == opaque_V4SI_type_node); } @@ -22148,7 +25001,7 @@ rs6000_vector_mode_supported_p (enum mac if (TARGET_SPE && SPE_VECTOR_MODE (mode)) return true; - else if (TARGET_ALTIVEC && ALTIVEC_VECTOR_MODE (mode)) + else if (VECTOR_MEM_ALTIVEC_OR_VSX_P (mode)) return true; else Index: gcc-4.3.4-20091019/gcc/doc/extend.texi =================================================================== --- gcc-4.3.4-20091019.orig/gcc/doc/extend.texi 2009-10-19 13:39:51.000000000 +0200 +++ gcc-4.3.4-20091019/gcc/doc/extend.texi 2009-10-19 13:40:37.000000000 +0200 @@ -1,5 +1,6 @@ @c Copyright (C) 1988, 1989, 1992, 1993, 1994, 1996, 1998, 1999, 2000, 2001, -@c 2002, 2003, 2004, 2005, 2006, 2007, 2008 Free Software Foundation, Inc. +@c 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009 +@c Free Software Foundation, Inc. @c This is part of the GCC manual. @c For copying conditions, see the file gcc.texi. 
@@ -6713,7 +6714,7 @@ instructions, but allow the compiler to * X86 Built-in Functions:: * MIPS DSP Built-in Functions:: * MIPS Paired-Single Support:: -* PowerPC AltiVec Built-in Functions:: +* PowerPC AltiVec/VSX Built-in Functions:: * SPARC VIS Built-in Functions:: * SPU Built-in Functions:: @end menu @@ -8851,7 +8852,7 @@ else @end smallexample @end table -@node PowerPC AltiVec Built-in Functions +@node PowerPC AltiVec/VSX Built-in Functions @subsection PowerPC AltiVec Built-in Functions GCC provides an interface for the PowerPC family of processors to access @@ -8877,6 +8878,19 @@ vector bool int vector float @end smallexample +If @option{-mvsx} is used the following additional vector types are +implemented. + +@smallexample +vector unsigned long +vector signed long +vector double +@end smallexample + +The long types are only implemented for 64-bit code generation, and +the long type is only used in the floating point/integer conversion +instructions. + GCC's implementation of the high-level language interface available from C and C++ code differs from Motorola's documentation in several ways. 
@@ -9142,6 +9156,8 @@ vector signed char vec_vavgsb (vector si vector unsigned char vec_vavgub (vector unsigned char, vector unsigned char); +vector float vec_copysign (vector float); + vector float vec_ceil (vector float); vector signed int vec_cmpb (vector float, vector float); @@ -10744,6 +10760,92 @@ int vec_any_numeric (vector float); int vec_any_out (vector float, vector float); @end smallexample +If the vector/scalar (VSX) instruction set is available, the following +additional functions are available: + +@smallexample +vector double vec_abs (vector double); +vector double vec_add (vector double, vector double); +vector double vec_and (vector double, vector double); +vector double vec_and (vector double, vector bool long); +vector double vec_and (vector bool long, vector double); +vector double vec_andc (vector double, vector double); +vector double vec_andc (vector double, vector bool long); +vector double vec_andc (vector bool long, vector double); +vector double vec_ceil (vector double); +vector bool long vec_cmpeq (vector double, vector double); +vector bool long vec_cmpge (vector double, vector double); +vector bool long vec_cmpgt (vector double, vector double); +vector bool long vec_cmple (vector double, vector double); +vector bool long vec_cmplt (vector double, vector double); +vector float vec_div (vector float, vector float); +vector double vec_div (vector double, vector double); +vector double vec_floor (vector double); +vector double vec_madd (vector double, vector double, vector double); +vector double vec_max (vector double, vector double); +vector double vec_min (vector double, vector double); +vector float vec_msub (vector float, vector float, vector float); +vector double vec_msub (vector double, vector double, vector double); +vector float vec_mul (vector float, vector float); +vector double vec_mul (vector double, vector double); +vector float vec_nearbyint (vector float); +vector double vec_nearbyint (vector double); +vector float 
vec_nmadd (vector float, vector float, vector float); +vector double vec_nmadd (vector double, vector double, vector double); +vector double vec_nmsub (vector double, vector double, vector double); +vector double vec_nor (vector double, vector double); +vector double vec_or (vector double, vector double); +vector double vec_or (vector double, vector bool long); +vector double vec_or (vector bool long, vector double); +vector double vec_perm (vector double, + vector double, + vector unsigned char); +vector float vec_rint (vector float); +vector double vec_rint (vector double); +vector double vec_sel (vector double, vector double, vector bool long); +vector double vec_sel (vector double, vector double, vector unsigned long); +vector double vec_sub (vector double, vector double); +vector float vec_sqrt (vector float); +vector double vec_sqrt (vector double); +vector double vec_trunc (vector double); +vector double vec_xor (vector double, vector double); +vector double vec_xor (vector double, vector bool long); +vector double vec_xor (vector bool long, vector double); +int vec_all_eq (vector double, vector double); +int vec_all_ge (vector double, vector double); +int vec_all_gt (vector double, vector double); +int vec_all_le (vector double, vector double); +int vec_all_lt (vector double, vector double); +int vec_all_nan (vector double); +int vec_all_ne (vector double, vector double); +int vec_all_nge (vector double, vector double); +int vec_all_ngt (vector double, vector double); +int vec_all_nle (vector double, vector double); +int vec_all_nlt (vector double, vector double); +int vec_all_numeric (vector double); +int vec_any_eq (vector double, vector double); +int vec_any_ge (vector double, vector double); +int vec_any_gt (vector double, vector double); +int vec_any_le (vector double, vector double); +int vec_any_lt (vector double, vector double); +int vec_any_nan (vector double); +int vec_any_ne (vector double, vector double); +int vec_any_nge (vector double, vector 
double); +int vec_any_ngt (vector double, vector double); +int vec_any_nle (vector double, vector double); +int vec_any_nlt (vector double, vector double); +int vec_any_numeric (vector double); +@end smallexample + +GCC provides a few other builtins on Powerpc to access certain instructions: +@smallexample +float __builtin_recipdivf (float, float); +float __builtin_rsqrtf (float); +double __builtin_recipdiv (double, double); +long __builtin_bpermd (long, long); +int __builtin_bswap16 (int); +@end smallexample + @node SPARC VIS Built-in Functions @subsection SPARC VIS Built-in Functions Index: gcc-4.3.4-20091019/gcc/doc/invoke.texi =================================================================== --- gcc-4.3.4-20091019.orig/gcc/doc/invoke.texi 2009-10-19 13:39:52.000000000 +0200 +++ gcc-4.3.4-20091019/gcc/doc/invoke.texi 2009-10-19 13:40:37.000000000 +0200 @@ -1,5 +1,5 @@ @c Copyright (C) 1988, 1989, 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999, -@c 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008 +@c 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009 @c Free Software Foundation, Inc. @c This is part of the GCC manual. @c For copying conditions, see the file gcc.texi. @@ -11,7 +11,7 @@ @c man begin COPYRIGHT Copyright @copyright{} 1988, 1989, 1992, 1993, 1994, 1995, 1996, 1997, 1998, -1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008 +1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009 Free Software Foundation, Inc. Permission is granted to copy, distribute and/or modify this document @@ -685,19 +685,22 @@ See RS/6000 and PowerPC Options. 
-maltivec -mno-altivec @gol -mpowerpc-gpopt -mno-powerpc-gpopt @gol -mpowerpc-gfxopt -mno-powerpc-gfxopt @gol --mmfcrf -mno-mfcrf -mpopcntb -mno-popcntb -mfprnd -mno-fprnd @gol +-mmfcrf -mno-mfcrf -mpopcntb -mno-popcntb -mpopcntd -mno-popcntd @gol +-mfprnd -mno-fprnd @gol -mcmpb -mno-cmpb -mmfpgpr -mno-mfpgpr -mhard-dfp -mno-hard-dfp @gol -mnew-mnemonics -mold-mnemonics @gol -mfull-toc -mminimal-toc -mno-fp-in-toc -mno-sum-in-toc @gol -m64 -m32 -mxl-compat -mno-xl-compat -mpe @gol -malign-power -malign-natural @gol -msoft-float -mhard-float -mmultiple -mno-multiple @gol +-msingle-float -mdouble-float -msimple-fpu @gol -mstring -mno-string -mupdate -mno-update @gol +-mavoid-indexed-addresses -mno-avoid-indexed-addresses @gol -mfused-madd -mno-fused-madd -mbit-align -mno-bit-align @gol -mstrict-align -mno-strict-align -mrelocatable @gol -mno-relocatable -mrelocatable-lib -mno-relocatable-lib @gol -mtoc -mno-toc -mlittle -mlittle-endian -mbig -mbig-endian @gol --mdynamic-no-pic -maltivec -mswdiv @gol +-mdynamic-no-pic -maltivec -mswdiv @gol -mprioritize-restricted-insns=@var{priority} @gol -msched-costly-dep=@var{dependence_type} @gol -minsert-sched-nops=@var{scheme} @gol @@ -716,7 +719,7 @@ See RS/6000 and PowerPC Options. 
-mfloat-gprs=yes -mfloat-gprs=no -mfloat-gprs=single -mfloat-gprs=double @gol -mprototype -mno-prototype @gol -msim -mmvme -mads -myellowknife -memb -msdata @gol --msdata=@var{opt} -mvxworks -mwindiss -G @var{num} -pthread} +-msdata=@var{opt} -mvxworks -G @var{num} -pthread} @emph{S/390 and zSeries Options} @gccoptlist{-mtune=@var{cpu-type} -march=@var{cpu-type} @gol @@ -12831,6 +12834,8 @@ These @samp{-m} options are defined for @itemx -mno-mfcrf @itemx -mpopcntb @itemx -mno-popcntb +@itemx -mpopcntd +@itemx -mno-popcntd @itemx -mfprnd @itemx -mno-fprnd @itemx -mcmpb @@ -12855,6 +12860,8 @@ These @samp{-m} options are defined for @opindex mno-mfcrf @opindex mpopcntb @opindex mno-popcntb +@opindex mpopcntd +@opindex mno-popcntd @opindex mfprnd @opindex mno-fprnd @opindex mcmpb @@ -12904,6 +12911,9 @@ The @option{-mpopcntb} option allows GCC double precision FP reciprocal estimate instruction implemented on the POWER5 processor and other processors that support the PowerPC V2.02 architecture. +The @option{-mpopcntd} option allows GCC to generate the popcount +instruction implemented on the POWER7 processor and other processors +that support the PowerPC V2.06 architecture. The @option{-mfprnd} option allows GCC to generate the FP round to integer instructions implemented on the POWER5+ processor and other processors that support the PowerPC V2.03 architecture. @@ -12951,16 +12961,16 @@ should normally not specify either @opti Set architecture type, register usage, choice of mnemonics, and instruction scheduling parameters for machine type @var{cpu_type}. 
Supported values for @var{cpu_type} are @samp{401}, @samp{403}, -@samp{405}, @samp{405fp}, @samp{440}, @samp{440fp}, @samp{505}, -@samp{601}, @samp{602}, @samp{603}, @samp{603e}, @samp{604}, +@samp{405}, @samp{405fp}, @samp{440}, @samp{440fp}, @samp{464}, @samp{464fp}, +@samp{505}, @samp{601}, @samp{602}, @samp{603}, @samp{603e}, @samp{604}, @samp{604e}, @samp{620}, @samp{630}, @samp{740}, @samp{7400}, @samp{7450}, @samp{750}, @samp{801}, @samp{821}, @samp{823}, -@samp{860}, @samp{970}, @samp{8540}, @samp{ec603e}, @samp{G3}, -@samp{G4}, @samp{G5}, @samp{power}, @samp{power2}, @samp{power3}, -@samp{power4}, @samp{power5}, @samp{power5+}, @samp{power6}, -@samp{power6x}, @samp{power7}, -@samp{common}, @samp{powerpc}, @samp{powerpc64}, -@samp{rios}, @samp{rios1}, @samp{rios2}, @samp{rsc}, and @samp{rs64}. +@samp{860}, @samp{970}, @samp{8540}, @samp{e300c2}, @samp{e300c3}, +@samp{e500mc}, @samp{ec603e}, @samp{G3}, @samp{G4}, @samp{G5}, +@samp{power}, @samp{power2}, @samp{power3}, @samp{power4}, +@samp{power5}, @samp{power5+}, @samp{power6}, @samp{power6x}, @samp{power7} +@samp{common}, @samp{powerpc}, @samp{powerpc64}, @samp{rios}, +@samp{rios1}, @samp{rios2}, @samp{rsc}, and @samp{rs64}. @option{-mcpu=common} selects a completely generic processor. Code generated under this option will run on any POWER or PowerPC processor. 
@@ -12982,8 +12992,9 @@ The @option{-mcpu} options automatically following options: @gccoptlist{-maltivec -mfprnd -mhard-float -mmfcrf -mmultiple @gol --mnew-mnemonics -mpopcntb -mpower -mpower2 -mpowerpc64 @gol --mpowerpc-gpopt -mpowerpc-gfxopt -mstring -mmulhw -mdlmzb -mmfpgpr} +-mnew-mnemonics -mpopcntb -mpopcntd -mpower -mpower2 -mpowerpc64 @gol +-mpowerpc-gpopt -mpowerpc-gfxopt -msingle-float -mdouble-float @gol +-msimple-fpu -mstring -mmulhw -mdlmzb -mmfpgpr -mvsx} The particular options set for any particular CPU will vary between compiler versions, depending on what setting seems to produce optimal @@ -13030,7 +13041,7 @@ the AltiVec instruction set. You may al enhancements. @item -mvrsave -@item -mno-vrsave +@itemx -mno-vrsave @opindex mvrsave @opindex mno-vrsave Generate VRSAVE instructions when generating AltiVec code. @@ -13084,6 +13095,14 @@ instructions. This option has been deprecated. Use @option{-mspe} and @option{-mno-spe} instead. +@item -mvsx +@itemx -mno-vsx +@opindex mvsx +@opindex mno-vsx +Generate code that uses (does not use) vector/scalar (VSX) +instructions, and also enable the use of built-in functions that allow +more direct access to the VSX instruction set. + @item -mfloat-gprs=@var{yes/single/double/no} @itemx -mfloat-gprs @opindex mfloat-gprs @@ -13209,6 +13228,28 @@ Generate code that does not use (uses) t Software floating point emulation is provided if you use the @option{-msoft-float} option, and pass the option to GCC when linking. +@item -msingle-float +@itemx -mdouble-float +@opindex msingle-float +@opindex mdouble-float +Generate code for single or double-precision floating point operations. +@option{-mdouble-float} implies @option{-msingle-float}. + +@item -msimple-fpu +@opindex msimple-fpu +Do not generate sqrt and div instructions for hardware floating point unit. + +@item -mfpu +@opindex mfpu +Specify type of floating point unit. 
Valid values are @var{sp_lite} +(equivalent to -msingle-float -msimple-fpu), @var{dp_lite} (equivalent +to -mdouble-float -msimple-fpu), @var{sp_full} (equivalent to -msingle-float), +and @var{dp_full} (equivalent to -mdouble-float). + +@item -mxilinx-fpu +@opindex mxilinx-fpu +Perform optimizations for floating point unit on Xilinx PPC 405/440. + @item -mmultiple @itemx -mno-multiple @opindex mmultiple @@ -13246,6 +13287,16 @@ stack pointer is updated and the address stored, which means code that walks the stack frame across interrupts or signals may get corrupted data. +@item -mavoid-indexed-addresses +@item -mno-avoid-indexed-addresses +@opindex mavoid-indexed-addresses +@opindex mno-avoid-indexed-addresses +Generate code that tries to avoid (not avoid) the use of indexed load +or store instructions. These instructions can incur a performance +penalty on Power6 processors in certain situations, such as when +stepping through large arrays that cross a 16M boundary. This option +is enabled by default when targetting Power6 and disabled otherwise. + @item -mfused-madd @itemx -mno-fused-madd @opindex mfused-madd @@ -13259,7 +13310,7 @@ hardware floating is used. @opindex mmulhw @opindex mno-mulhw Generate code that uses (does not use) the half-word multiply and -multiply-accumulate instructions on the IBM 405 and 440 processors. +multiply-accumulate instructions on the IBM 405, 440 and 464 processors. These instructions are generated by default when targetting those processors. @@ -13268,7 +13319,7 @@ processors. @opindex mdlmzb @opindex mno-dlmzb Generate code that uses (does not use) the string-search @samp{dlmzb} -instruction on the IBM 405 and 440 processors. This instruction is +instruction on the IBM 405, 440 and 464 processors. This instruction is generated by default when targetting those processors. 
@item -mno-bit-align @@ -13488,10 +13539,6 @@ On embedded PowerPC systems, assume that On System V.4 and embedded PowerPC systems, specify that you are compiling for a VxWorks system. -@item -mwindiss -@opindex mwindiss -Specify that you are compiling for the WindISS simulation environment. - @item -memb @opindex memb On embedded PowerPC systems, set the @var{PPC_EMB} bit in the ELF flags @@ -13543,8 +13590,8 @@ On System V.4 and embedded PowerPC syste compile code the same as @option{-msdata=eabi}, otherwise compile code the same as @option{-msdata=sysv}. -@item -msdata-data -@opindex msdata-data +@item -msdata=data +@opindex msdata=data On System V.4 and embedded PowerPC systems, put small global data in the @samp{.sdata} section. Put small uninitialized global data in the @samp{.sbss} section. Do not use register @code{r13} @@ -13611,6 +13658,16 @@ to use or discard it. In the future, we may cause GCC to ignore all longcall specifications when the linker is known to generate glue. +@item -mtls-markers +@itemx -mno-tls-markers +@opindex mtls-markers +@opindex mno-tls-markers +Mark (do not mark) calls to @code{__tls_get_addr} with a relocation +specifying the function argument. The relocation allows ld to +reliably associate function call with argument setup instructions for +TLS optimization, which in turn allows gcc to better schedule the +sequence. + @item -pthread @opindex pthread Adds support for multithreading with the @dfn{pthreads} library. Index: gcc-4.3.4-20091019/gcc/doc/md.texi =================================================================== --- gcc-4.3.4-20091019.orig/gcc/doc/md.texi 2009-10-19 13:39:51.000000000 +0200 +++ gcc-4.3.4-20091019/gcc/doc/md.texi 2009-10-19 13:40:37.000000000 +0200 @@ -1,5 +1,6 @@ @c Copyright (C) 1988, 1989, 1992, 1993, 1994, 1996, 1998, 1999, 2000, 2001, -@c 2002, 2003, 2004, 2005, 2006, 2007, 2008 Free Software Foundation, Inc. +@c 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009 +@c Free Software Foundation, Inc. 
@c This is part of the GCC manual. @c For copying conditions, see the file gcc.texi. @@ -1860,11 +1861,26 @@ A register indirect memory operand @item b Address base register +@item d +Floating point register (containing 64-bit value) + @item f -Floating point register +Floating point register (containing 32-bit value) @item v -Vector register +Altivec vector register + +@item wd +VSX vector register to hold vector double data + +@item wf +VSX vector register to hold vector float data + +@item ws +VSX vector register to hold scalar float data + +@item wa +Any VSX register @item h @samp{MQ}, @samp{CTR}, or @samp{LINK} register @@ -1920,13 +1936,40 @@ instruction per word Integer/Floating point constant that can be loaded into a register using three instructions +@item m +Memory operand. Note that on PowerPC targets, @code{m} can include +addresses that update the base register. It is therefore only safe +to use @samp{m} in an @code{asm} statement if that @code{asm} statement +accesses the operand exactly once. The @code{asm} statement must also +use @samp{%U@var{<opno>}} as a placeholder for the ``update'' flag in the +corresponding load or store instruction. For example: + +@smallexample +asm ("st%U0 %1,%0" : "=m" (mem) : "r" (val)); +@end smallexample + +is correct but: + +@smallexample +asm ("st %1,%0" : "=m" (mem) : "r" (val)); +@end smallexample + +is not. Use @code{es} rather than @code{m} if you don't want the +base register to be updated. + +@item es +A ``stable'' memory operand; that is, one which does not include any +automodification of the base register. Unlike @samp{m}, this constraint +can be used in @code{asm} statements that might access the operand +several times, or that might not access it at all. 
+ @item Q -Memory operand that is an offset from a register (@samp{m} is preferable -for @code{asm} statements) +Memory operand that is an offset from a register (it is usually better +to use @samp{m} or @samp{es} in @code{asm} statements) @item Z -Memory operand that is an indexed or indirect from a register (@samp{m} is -preferable for @code{asm} statements) +Memory operand that is an indexed or indirect from a register (it is +usually better to use @samp{m} or @samp{es} in @code{asm} statements) @item R AIX TOC entry @@ -1950,6 +1993,9 @@ AND masks that can be performed by two r @item W Vector constant that does not require memory +@item j +Vector constant that is all zeros. + @end table @item MorphoTech family---@file{config/mt/mt.h}