/* Basic block reordering routines for the GNU compiler.
   Copyright (C) 2000-2018 Free Software Foundation, Inc.

   This file is part of GCC.

   GCC is free software; you can redistribute it and/or modify it
   under the terms of the GNU General Public License as published by
   the Free Software Foundation; either version 3, or (at your option)
   any later version.

   GCC is distributed in the hope that it will be useful, but WITHOUT
   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
   License for more details.

   You should have received a copy of the GNU General Public License
   along with GCC; see the file COPYING3.  If not see
   <http://www.gnu.org/licenses/>.  */

/* This file contains the "reorder blocks" pass, which changes the control
   flow of a function to encounter fewer branches; the "partition blocks"
   pass, which divides the basic blocks into "hot" and "cold" partitions,
   which are kept separate; and the "duplicate computed gotos" pass, which
   duplicates blocks ending in an indirect jump.

   There are two algorithms for "reorder blocks": the "simple" algorithm,
   which just rearranges blocks, trying to minimize the number of executed
   unconditional branches; and the "software trace cache" algorithm, which
   also copies code, and in general tries a lot harder to have long linear
   pieces of machine code executed.  This algorithm is described next.  */

/* This (greedy) algorithm constructs traces in several rounds.
   The construction starts from "seeds".  The seed for the first round
   is the entry point of the function.  When there is more than one seed,
   the one with the lowest key in the heap is selected first (see bb_to_key).
   Then the algorithm repeatedly adds the most probable successor to the end
   of a trace.  Finally it connects the traces.

   There are two parameters: Branch Threshold and Exec Threshold.
   If the probability of an edge to a successor of the current basic block is
   lower than Branch Threshold or its count is lower than Exec Threshold,
   then the successor will be a seed in one of the next rounds.
   Each round uses lower thresholds than the previous one.
   The last round has to have these parameters set to zero so that the
   remaining blocks are picked up.

   The algorithm selects the most probable successor from all unvisited
   successors and successors that have been added to this trace.
   The other successors (those that have not been "sent" to the next round)
   will be other seeds for this round, and the secondary traces will start
   from them.
   If the successor has not been visited in this trace, it is added to the
   trace (however, there is some heuristic for simple branches).
   If the successor has been visited in this trace, a loop has been found.
   If the loop has many iterations, the loop is rotated so that the source
   block of the most probable edge going out of the loop is the last block
   of the trace.
   If the loop has few iterations and there is no edge from the last block of
   the loop going out of the loop, the loop header is duplicated.

   When connecting traces, the algorithm first checks whether there is an edge
   from the last block of a trace to the first block of another trace.
   When there are still some unconnected traces it checks whether there exists
   a basic block BB such that BB is a successor of the last block of a trace
   and BB is a predecessor of the first block of another trace.  In this case,
   BB is duplicated, added at the end of the first trace and the traces are
   connected through it.
   The remaining traces are simply concatenated, so each of them is entered
   through a jump to its first block.

   The above description is for the full algorithm, which is used when the
   function is optimized for speed.  When the function is optimized for size,
   in order to reduce long jumps and connect more fallthru edges, the
   algorithm is modified as follows:
   (1) Break long traces into short ones.  A trace is broken at a block that
   has multiple predecessors/successors during trace discovery.  When
   connecting traces, only connect Trace n with Trace n + 1.  This change
   reduces most long jumps compared with the above algorithm.
   (2) Ignore the edge probability and count for fallthru edges.
   (3) Keep the original order of blocks when there is no chance to fall
   through.  We rely on the results of cfg_cleanup.

   To implement the change for code size optimization, the block's index is
   used as the key and all traces are found in one round.

   References:

   "Software Trace Cache"
   A. Ramirez, J. Larriba-Pey, C. Navarro, J. Torrellas and M. Valero; 1999
   http://citeseer.nj.nec.com/15361.html

*/
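
/* A small worked example of trace discovery (illustrative only, not taken
   from any particular testcase).  Consider this CFG, with edge
   probabilities:

	A
       / \		A->B: 90%,  A->C: 10%
      B   C
       \ /		B->D: 100%, C->D: 100%
	D

   Starting from the seed A in round 0, the algorithm repeatedly follows the
   most probable unvisited successor, producing the trace A B D.  The edge
   A->C falls below the round-0 Branch Threshold, so C is sent to a later
   round, where it becomes the seed of a secondary trace; the traces are
   then connected, e.g. into the final order A B D C.  */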

#include "config.h"
#define INCLUDE_ALGORITHM /* stable_sort */
#include "system.h"
#include "coretypes.h"
#include "backend.h"
#include "target.h"
#include "rtl.h"
#include "tree.h"
#include "cfghooks.h"
#include "df.h"
#include "memmodel.h"
#include "optabs.h"
#include "regs.h"
#include "emit-rtl.h"
#include "output.h"
#include "expr.h"
#include "params.h"
#include "tree-pass.h"
#include "cfgrtl.h"
#include "cfganal.h"
#include "cfgbuild.h"
#include "cfgcleanup.h"
#include "bb-reorder.h"
#include "except.h"
#include "fibonacci_heap.h"
#include "stringpool.h"
#include "attribs.h"
#include "common/common-target.h"

/* The number of rounds.  In most cases there will only be 4 rounds, but
   when partitioning hot and cold basic blocks into separate sections of
   the object file there will be an extra round.  */
#define N_ROUNDS 5

struct target_bb_reorder default_target_bb_reorder;
#if SWITCHABLE_TARGET
struct target_bb_reorder *this_target_bb_reorder = &default_target_bb_reorder;
#endif

#define uncond_jump_length \
  (this_target_bb_reorder->x_uncond_jump_length)

/* Branch thresholds in thousandths (per mille) of the REG_BR_PROB_BASE.  */
static const int branch_threshold[N_ROUNDS] = {400, 200, 100, 0, 0};

/* Exec thresholds in thousandths (per mille) of the count of bb 0.  */
static const int exec_threshold[N_ROUNDS] = {500, 200, 50, 0, 0};

/* If edge count is lower than DUPLICATION_THRESHOLD per mille of entry
   block the edge destination is not duplicated while connecting traces.  */
#define DUPLICATION_THRESHOLD 100
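
/* For example (a sketch of the scaling actually done in find_traces): in
   round 0, branch_threshold[0] == 400 gives a branch cut-off of
   REG_BR_PROB_BASE * 400 / 1000, i.e. an edge probability of 40%, and
   exec_threshold[0] == 500 gives a count threshold of half of
   max_entry_count.  Both thresholds drop to zero in the final rounds so
   that every remaining block is eventually picked up.  */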

typedef fibonacci_heap <long, basic_block_def> bb_heap_t;
typedef fibonacci_node <long, basic_block_def> bb_heap_node_t;
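
/* A minimal usage sketch of the heap API used throughout this pass
   (illustrative only; NEW_KEY stands for an arbitrary value, and the calls
   mirror those made in find_traces and find_traces_1_round below):

     bb_heap_t *heap = new bb_heap_t (LONG_MIN);
     bb_heap_node_t *node = heap->insert (bb_to_key (bb), bb);
     heap->replace_key (node, new_key);
     while (!heap->empty ())
       bb = heap->extract_min ();
     delete heap;
*/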

/* Structure to hold needed information for each basic block.  */
struct bbro_basic_block_data
{
  /* Which trace is the bb start of (-1 means it is not a start of any).  */
  int start_of_trace;

  /* Which trace is the bb end of (-1 means it is not an end of any).  */
  int end_of_trace;

  /* Which trace is the bb in?  */
  int in_trace;

  /* Which trace was this bb visited in?  */
  int visited;

  /* Cached maximum frequency of interesting incoming edges.
     Minus one means not yet computed.  */
  int priority;

  /* Which heap is BB in (if any)?  */
  bb_heap_t *heap;

  /* Which heap node is BB in (if any)?  */
  bb_heap_node_t *node;
};

/* The current size of the following dynamic array.  */
static int array_size;

/* The array which holds needed information for basic blocks.  */
static bbro_basic_block_data *bbd;

/* To avoid frequent reallocation the size of arrays is greater than needed,
   the number of elements is (not less than) 1.25 * size_wanted.  */
#define GET_ARRAY_SIZE(X) ((((X) / 4) + 1) * 5)
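
/* E.g. GET_ARRAY_SIZE (100) == ((100 / 4) + 1) * 5 == 130, i.e. at least
   1.25 * 100 elements, so a growing function does not reallocate on every
   small size increase.  */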

/* Free the memory and set the pointer to NULL.  */
#define FREE(P) (gcc_assert (P), free (P), P = 0)

/* Structure for holding information about a trace.  */
struct trace
{
  /* First and last basic block of the trace.  */
  basic_block first, last;

  /* The round of the STC creation which this trace was found in.  */
  int round;

  /* The length (i.e. the number of basic blocks) of the trace.  */
  int length;
};

/* Maximum count of one of the entry blocks.  */
static profile_count max_entry_count;

/* Local function prototypes.  */
static void find_traces_1_round (int, profile_count, struct trace *, int *,
				 int, bb_heap_t **, int);
static basic_block copy_bb (basic_block, edge, basic_block, int);
static long bb_to_key (basic_block);
static bool better_edge_p (const_basic_block, const_edge, profile_probability,
			   profile_count, profile_probability, profile_count,
			   const_edge);
static bool copy_bb_p (const_basic_block, int);

/* Return the trace number in which BB was visited.  */

static int
bb_visited_trace (const_basic_block bb)
{
  gcc_assert (bb->index < array_size);
  return bbd[bb->index].visited;
}

/* This function marks BB as visited in trace number TRACE.  */

static void
mark_bb_visited (basic_block bb, int trace)
{
  bbd[bb->index].visited = trace;
  if (bbd[bb->index].heap)
    {
      bbd[bb->index].heap->delete_node (bbd[bb->index].node);
      bbd[bb->index].heap = NULL;
      bbd[bb->index].node = NULL;
    }
}

/* Check to see if bb should be pushed into the next round of trace
   collections or not.  Reasons for pushing the block forward are:
   (1) The block is cold, we are doing partitioning, and there will be
   another round (cold partition blocks are not supposed to be collected
   into traces until the very last round); or
   (2) There will be another round, and the basic block is not "hot
   enough" for the current round of trace collection.  */

static bool
push_to_next_round_p (const_basic_block bb, int round, int number_of_rounds,
		      profile_count count_th)
{
  bool there_exists_another_round;
  bool block_not_hot_enough;

  there_exists_another_round = round < number_of_rounds - 1;

  block_not_hot_enough = (bb->count < count_th
			  || probably_never_executed_bb_p (cfun, bb));

  if (there_exists_another_round
      && block_not_hot_enough)
    return true;
  else
    return false;
}

/* Find the traces for Software Trace Cache.  Chain each trace through
   RBI()->next.  Store the number of traces in *N_TRACES and the
   descriptions of the traces in TRACES.  */

static void
find_traces (int *n_traces, struct trace *traces)
{
  int i;
  int number_of_rounds;
  edge e;
  edge_iterator ei;
  bb_heap_t *heap = new bb_heap_t (LONG_MIN);

  /* Add one extra round of trace collection when partitioning hot/cold
     basic blocks into separate sections.  The last round is for all the
     cold blocks (and ONLY the cold blocks).  */

  number_of_rounds = N_ROUNDS - 1;

  /* Insert entry points of function into heap.  */
  max_entry_count = profile_count::zero ();
  FOR_EACH_EDGE (e, ei, ENTRY_BLOCK_PTR_FOR_FN (cfun)->succs)
    {
      bbd[e->dest->index].heap = heap;
      bbd[e->dest->index].node = heap->insert (bb_to_key (e->dest), e->dest);
      if (e->dest->count > max_entry_count)
	max_entry_count = e->dest->count;
    }

  /* Find the traces.  */
  for (i = 0; i < number_of_rounds; i++)
    {
      profile_count count_threshold;

      if (dump_file)
	fprintf (dump_file, "STC - round %d\n", i + 1);

      count_threshold = max_entry_count.apply_scale (exec_threshold[i], 1000);

      find_traces_1_round (REG_BR_PROB_BASE * branch_threshold[i] / 1000,
			   count_threshold, traces, n_traces, i, &heap,
			   number_of_rounds);
    }
  delete heap;

  if (dump_file)
    {
      for (i = 0; i < *n_traces; i++)
	{
	  basic_block bb;
	  fprintf (dump_file, "Trace %d (round %d):  ", i + 1,
		   traces[i].round + 1);
	  for (bb = traces[i].first;
	       bb != traces[i].last;
	       bb = (basic_block) bb->aux)
	    {
	      fprintf (dump_file, "%d [", bb->index);
	      bb->count.dump (dump_file);
	      fprintf (dump_file, "] ");
	    }
	  fprintf (dump_file, "%d [", bb->index);
	  bb->count.dump (dump_file);
	  fprintf (dump_file, "]\n");
	}
      fflush (dump_file);
    }
}

/* Rotate a loop whose back edge is BACK_EDGE in the tail of trace TRACE
   (with sequential number TRACE_N).  */

static basic_block
rotate_loop (edge back_edge, struct trace *trace, int trace_n)
{
  basic_block bb;

  /* Information about the best end (end after rotation) of the loop.  */
  basic_block best_bb = NULL;
  edge best_edge = NULL;
  profile_count best_count = profile_count::uninitialized ();
  /* The best edge is preferred when its destination is not visited yet
     or is a start block of some trace.  */
  bool is_preferred = false;

  /* Find the most frequent edge that goes out from the current trace.  */
  bb = back_edge->dest;
  do
    {
      edge e;
      edge_iterator ei;

      FOR_EACH_EDGE (e, ei, bb->succs)
	if (e->dest != EXIT_BLOCK_PTR_FOR_FN (cfun)
	    && bb_visited_trace (e->dest) != trace_n
	    && (e->flags & EDGE_CAN_FALLTHRU)
	    && !(e->flags & EDGE_COMPLEX))
	{
	  if (is_preferred)
	    {
	      /* The best edge is preferred.  */
	      if (!bb_visited_trace (e->dest)
		  || bbd[e->dest->index].start_of_trace >= 0)
		{
		  /* The current edge E is also preferred.  */
		  if (e->count () > best_count)
		    {
		      best_count = e->count ();
		      best_edge = e;
		      best_bb = bb;
		    }
		}
	    }
	  else
	    {
	      if (!bb_visited_trace (e->dest)
		  || bbd[e->dest->index].start_of_trace >= 0)
		{
		  /* The current edge E is preferred.  */
		  is_preferred = true;
		  best_count = e->count ();
		  best_edge = e;
		  best_bb = bb;
		}
	      else
		{
		  if (!best_edge || e->count () > best_count)
		    {
		      best_count = e->count ();
		      best_edge = e;
		      best_bb = bb;
		    }
		}
	    }
	}
      bb = (basic_block) bb->aux;
    }
  while (bb != back_edge->dest);

  if (best_bb)
    {
      /* Rotate the loop so that the BEST_EDGE goes out from the last block of
	 the trace.  */
      if (back_edge->dest == trace->first)
	{
	  trace->first = (basic_block) best_bb->aux;
	}
      else
	{
	  basic_block prev_bb;

	  for (prev_bb = trace->first;
	       prev_bb->aux != back_edge->dest;
	       prev_bb = (basic_block) prev_bb->aux)
	    ;
	  prev_bb->aux = best_bb->aux;

	  /* Try to get rid of an uncond jump to a cond jump.  */
	  if (single_succ_p (prev_bb))
	    {
	      basic_block header = single_succ (prev_bb);

	      /* Duplicate HEADER if it is a small block containing a cond
		 jump at the end.  */
	      if (any_condjump_p (BB_END (header)) && copy_bb_p (header, 0)
		  && !CROSSING_JUMP_P (BB_END (header)))
		copy_bb (header, single_succ_edge (prev_bb), prev_bb, trace_n);
	    }
	}
    }
  else
    {
      /* We have not found a suitable loop tail, so do no rotation.  */
      best_bb = back_edge->src;
    }
  best_bb->aux = NULL;
  return best_bb;
}
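
/* Illustrative sketch of the rotation above (hypothetical trace): if the
   trace tail is the loop H -> X -> Y with back edge Y -> H, and the most
   frequent edge leaving the loop departs from X, the circular aux chain is
   re-linked so that the tail becomes ... -> Y -> H -> X.  X is then the
   last block of the trace, and the exiting edge can fall through when the
   traces are connected.  */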

/* One round of finding traces.  Find traces for BRANCH_TH and COUNT_TH,
   i.e. do not include in traces basic blocks whose probability is lower
   than BRANCH_TH or whose count is lower than COUNT_TH.  Store the new
   traces into TRACES and modify the number of traces *N_TRACES.  Set the
   round (which the trace belongs to) to ROUND.  The function expects
   starting basic blocks to be in *HEAP and will delete *HEAP and store
   starting points for the next round into new *HEAP.  */

static void
find_traces_1_round (int branch_th, profile_count count_th,
		     struct trace *traces, int *n_traces, int round,
		     bb_heap_t **heap, int number_of_rounds)
{
  /* Heap for discarded basic blocks which are possible starting points for
     the next round.  */
  bb_heap_t *new_heap = new bb_heap_t (LONG_MIN);
  bool for_size = optimize_function_for_size_p (cfun);

  while (!(*heap)->empty ())
    {
      basic_block bb;
      struct trace *trace;
      edge best_edge, e;
      long key;
      edge_iterator ei;

      bb = (*heap)->extract_min ();
      bbd[bb->index].heap = NULL;
      bbd[bb->index].node = NULL;

      if (dump_file)
	fprintf (dump_file, "Getting bb %d\n", bb->index);

      /* If the BB's count is too low, send BB to the next round.  When
	 partitioning hot/cold blocks into separate sections, make sure all
	 the cold blocks (and ONLY the cold blocks) go into the (extra) final
	 round.  When optimizing for size, do not push to next round.  */

      if (!for_size
	  && push_to_next_round_p (bb, round, number_of_rounds,
				   count_th))
	{
	  int key = bb_to_key (bb);
	  bbd[bb->index].heap = new_heap;
	  bbd[bb->index].node = new_heap->insert (key, bb);

	  if (dump_file)
	    fprintf (dump_file,
		     "  Possible start point of next round: %d (key: %d)\n",
		     bb->index, key);
	  continue;
	}

      trace = traces + *n_traces;
      trace->first = bb;
      trace->round = round;
      trace->length = 0;
      bbd[bb->index].in_trace = *n_traces;
      (*n_traces)++;

      do
	{
	  bool ends_in_call;

	  /* The probability and count of the best edge.  */
	  profile_probability best_prob = profile_probability::uninitialized ();
	  profile_count best_count = profile_count::uninitialized ();

	  best_edge = NULL;
	  mark_bb_visited (bb, *n_traces);
	  trace->length++;

	  if (dump_file)
	    fprintf (dump_file, "Basic block %d was visited in trace %d\n",
		     bb->index, *n_traces);

	  ends_in_call = block_ends_with_call_p (bb);

	  /* Select the successor that will be placed after BB.  */
	  FOR_EACH_EDGE (e, ei, bb->succs)
	    {
	      gcc_assert (!(e->flags & EDGE_FAKE));

	      if (e->dest == EXIT_BLOCK_PTR_FOR_FN (cfun))
		continue;

	      if (bb_visited_trace (e->dest)
		  && bb_visited_trace (e->dest) != *n_traces)
		continue;

	      /* If partitioning hot/cold basic blocks, don't consider edges
		 that cross section boundaries.  */
	      if (BB_PARTITION (e->dest) != BB_PARTITION (bb))
		continue;

	      profile_probability prob = e->probability;
	      profile_count count = e->dest->count;

	      /* The only sensible preference for a call instruction is the
		 fallthru edge.  Don't bother selecting anything else.  */
	      if (ends_in_call)
		{
		  if (e->flags & EDGE_CAN_FALLTHRU)
		    {
		      best_edge = e;
		      best_prob = prob;
		      best_count = count;
		    }
		  continue;
		}

	      /* An edge that cannot be a fallthru edge, or an improbable or
		 infrequent successor, is an unsuitable successor.  When
		 optimizing for size, ignore the probability and count.  */
	      if (!(e->flags & EDGE_CAN_FALLTHRU) || (e->flags & EDGE_COMPLEX)
		  || !prob.initialized_p ()
		  || ((prob.to_reg_br_prob_base () < branch_th
		      || e->count () < count_th) && (!for_size)))
		continue;

	      if (better_edge_p (bb, e, prob, count, best_prob, best_count,
				 best_edge))
		{
		  best_edge = e;
		  best_prob = prob;
		  best_count = count;
		}
	    }

	  /* If the best destination has multiple predecessors and can be
	     duplicated cheaper than a jump, don't allow it to be added to
	     a trace; we'll duplicate it when connecting the traces later.
	     However, we need to check that this duplication wouldn't leave
	     the best destination with only crossing predecessors, because
	     this would change its effective partition from hot to cold.  */
	  if (best_edge
	      && EDGE_COUNT (best_edge->dest->preds) >= 2
	      && copy_bb_p (best_edge->dest, 0))
	    {
	      bool only_crossing_preds = true;
	      edge e;
	      edge_iterator ei;
	      FOR_EACH_EDGE (e, ei, best_edge->dest->preds)
		if (e != best_edge && !(e->flags & EDGE_CROSSING))
		  {
		    only_crossing_preds = false;
		    break;
		  }
	      if (!only_crossing_preds)
		best_edge = NULL;
	    }

	  /* If the best destination has multiple successors or predecessors,
	     don't allow it to be added when optimizing for size.  This makes
	     sure predecessors with smaller index are handled before the best
	     destination.  It breaks long traces and reduces long jumps.

	     Take if-then-else as an example.
		A
	       / \
	      B   C
	       \ /
		D
	     If we do not remove the best edge B->D/C->D, the final order
	     might be A B D ... C, with C at the end of the program.  If D's
	     successors and D are complicated, we might need long jumps for
	     A->C and C->D.  A similar issue exists for the order A C D ... B.

	     After removing the best edge, the final result will be ABCD/ACBD.
	     It does not add a jump compared with the previous order, but it
	     reduces the possibility of long jumps.  */
	  if (best_edge && for_size
	      && (EDGE_COUNT (best_edge->dest->succs) > 1
		 || EDGE_COUNT (best_edge->dest->preds) > 1))
	    best_edge = NULL;

	  /* Add all non-selected successors to the heaps.  */
	  FOR_EACH_EDGE (e, ei, bb->succs)
	    {
	      if (e == best_edge
		  || e->dest == EXIT_BLOCK_PTR_FOR_FN (cfun)
		  || bb_visited_trace (e->dest))
		continue;

	      key = bb_to_key (e->dest);

	      if (bbd[e->dest->index].heap)
		{
		  /* E->DEST is already in some heap.  */
		  if (key != bbd[e->dest->index].node->get_key ())
		    {
		      if (dump_file)
			{
			  fprintf (dump_file,
				   "Changing key for bb %d from %ld to %ld.\n",
				   e->dest->index,
				   (long) bbd[e->dest->index].node->get_key (),
				   key);
			}
		      bbd[e->dest->index].heap->replace_key
		        (bbd[e->dest->index].node, key);
		    }
		}
	      else
		{
		  bb_heap_t *which_heap = *heap;

		  profile_probability prob = e->probability;

		  if (!(e->flags & EDGE_CAN_FALLTHRU)
		      || (e->flags & EDGE_COMPLEX)
		      || !prob.initialized_p ()
		      || prob.to_reg_br_prob_base () < branch_th
		      || e->count () < count_th)
		    {
		      /* When partitioning hot/cold basic blocks, make sure
			 the cold blocks (and only the cold blocks) all get
			 pushed to the last round of trace collection.  When
			 optimizing for size, do not push to next round.  */

		      if (!for_size && push_to_next_round_p (e->dest, round,
							     number_of_rounds,
							     count_th))
			which_heap = new_heap;
		    }

		  bbd[e->dest->index].heap = which_heap;
		  bbd[e->dest->index].node = which_heap->insert (key, e->dest);

		  if (dump_file)
		    {
		      fprintf (dump_file,
			       "  Possible start of %s round: %d (key: %ld)\n",
			       (which_heap == new_heap) ? "next" : "this",
			       e->dest->index, (long) key);
		    }

		}
	    }

	  if (best_edge) /* Suitable successor was found.  */
	    {
	      if (bb_visited_trace (best_edge->dest) == *n_traces)
		{
		  /* We do nothing with loops consisting of a single basic
		     block.  */
		  if (best_edge->dest != bb)
		    {
		      if (best_edge->count ()
			  > best_edge->dest->count.apply_scale (4, 5))
			{
			  /* The loop has at least 4 iterations.  If the loop
			     header is not the first block of the function,
			     we can rotate the loop.  */

			  if (best_edge->dest
			      != ENTRY_BLOCK_PTR_FOR_FN (cfun)->next_bb)
			    {
			      if (dump_file)
				{
				  fprintf (dump_file,
					   "Rotating loop %d - %d\n",
					   best_edge->dest->index, bb->index);
				}
			      bb->aux = best_edge->dest;
			      bbd[best_edge->dest->index].in_trace =
							     (*n_traces) - 1;
			      bb = rotate_loop (best_edge, trace, *n_traces);
			    }
			}
		      else
			{
			  /* The loop has fewer than 4 iterations.  */

			  if (single_succ_p (bb)
			      && copy_bb_p (best_edge->dest,
			      		    optimize_edge_for_speed_p
			      		    (best_edge)))
			    {
			      bb = copy_bb (best_edge->dest, best_edge, bb,
					    *n_traces);
			      trace->length++;
			    }
			}
		    }

		  /* Terminate the trace.  */
		  break;
		}
	      else
		{
		  /* Check for a situation

		    A
		   /|
		  B |
		   \|
		    C

		  where
		  AB->count () + BC->count () >= AC->count ().
		  (i.e. 2 * B->count >= AC->count)
		  Best ordering is then A B C.

		  When optimizing for size, A B C is always the best order.

		  This situation is created for example by:

		  if (A) B;
		  C;

		  */

		  FOR_EACH_EDGE (e, ei, bb->succs)
		    if (e != best_edge
			&& (e->flags & EDGE_CAN_FALLTHRU)
			&& !(e->flags & EDGE_COMPLEX)
			&& !bb_visited_trace (e->dest)
			&& single_pred_p (e->dest)
			&& !(e->flags & EDGE_CROSSING)
			&& single_succ_p (e->dest)
			&& (single_succ_edge (e->dest)->flags
			    & EDGE_CAN_FALLTHRU)
			&& !(single_succ_edge (e->dest)->flags & EDGE_COMPLEX)
			&& single_succ (e->dest) == best_edge->dest
			&& (e->dest->count.apply_scale (2, 1)
			    >= best_edge->count () || for_size))
		      {
			best_edge = e;
			if (dump_file)
			  fprintf (dump_file, "Selecting BB %d\n",
				   best_edge->dest->index);
			break;
		      }

		  bb->aux = best_edge->dest;
		  bbd[best_edge->dest->index].in_trace = (*n_traces) - 1;
		  bb = best_edge->dest;
		}
	    }
	}
      while (best_edge);
      trace->last = bb;
      bbd[trace->first->index].start_of_trace = *n_traces - 1;
      if (bbd[trace->last->index].end_of_trace != *n_traces - 1)
	{
	  bbd[trace->last->index].end_of_trace = *n_traces - 1;
	  /* Update the cached maximum frequency for interesting predecessor
	     edges for successors of the new trace end.  */
	  FOR_EACH_EDGE (e, ei, trace->last->succs)
	    if (EDGE_FREQUENCY (e) > bbd[e->dest->index].priority)
	      bbd[e->dest->index].priority = EDGE_FREQUENCY (e);
	}

      /* The trace is terminated so we have to recount the keys in heap
	 (some block can have a lower key because now one of its predecessors
	 is an end of the trace).  */
      FOR_EACH_EDGE (e, ei, bb->succs)
	{
	  if (e->dest == EXIT_BLOCK_PTR_FOR_FN (cfun)
	      || bb_visited_trace (e->dest))
	    continue;

	  if (bbd[e->dest->index].heap)
	    {
	      key = bb_to_key (e->dest);
	      if (key != bbd[e->dest->index].node->get_key ())
		{
		  if (dump_file)
		    {
		      fprintf (dump_file,
			       "Changing key for bb %d from %ld to %ld.\n",
			       e->dest->index,
			       (long) bbd[e->dest->index].node->get_key (), key);
		    }
		  bbd[e->dest->index].heap->replace_key
		    (bbd[e->dest->index].node, key);
		}
	    }
	}
    }

  delete (*heap);

  /* "Return" the new heap.  */
  *heap = new_heap;
}

/* Create a duplicate of the basic block OLD_BB and redirect edge E to it, add
   it to the trace after BB, mark OLD_BB visited and update the pass's data
   structures (TRACE is the number of the trace to which OLD_BB is
   duplicated).  */

static basic_block
copy_bb (basic_block old_bb, edge e, basic_block bb, int trace)
{
  basic_block new_bb;

  new_bb = duplicate_block (old_bb, e, bb);
  BB_COPY_PARTITION (new_bb, old_bb);

  gcc_assert (e->dest == new_bb);

  if (dump_file)
    fprintf (dump_file,
	     "Duplicated bb %d (created bb %d)\n",
	     old_bb->index, new_bb->index);

  if (new_bb->index >= array_size
      || last_basic_block_for_fn (cfun) > array_size)
    {
      int i;
      int new_size;

      new_size = MAX (last_basic_block_for_fn (cfun), new_bb->index + 1);
      new_size = GET_ARRAY_SIZE (new_size);
      bbd = XRESIZEVEC (bbro_basic_block_data, bbd, new_size);
      for (i = array_size; i < new_size; i++)
	{
	  bbd[i].start_of_trace = -1;
	  bbd[i].end_of_trace = -1;
	  bbd[i].in_trace = -1;
	  bbd[i].visited = 0;
	  bbd[i].priority = -1;
	  bbd[i].heap = NULL;
	  bbd[i].node = NULL;
	}
      array_size = new_size;

      if (dump_file)
	{
	  fprintf (dump_file,
		   "Growing the dynamic array to %d elements.\n",
		   array_size);
	}
    }

  gcc_assert (!bb_visited_trace (e->dest));
  mark_bb_visited (new_bb, trace);
  new_bb->aux = bb->aux;
  bb->aux = new_bb;

  bbd[new_bb->index].in_trace = trace;

  return new_bb;
}

/* Compute and return the key (for the heap) of the basic block BB.  */

static long
bb_to_key (basic_block bb)
{
  edge e;
  edge_iterator ei;

  /* Use the index as the key to preserve the original block order.  */
  if (optimize_function_for_size_p (cfun))
    return bb->index;

  /* Do not start in probably never executed blocks.  */

  if (BB_PARTITION (bb) == BB_COLD_PARTITION
      || probably_never_executed_bb_p (cfun, bb))
    return BB_FREQ_MAX;

  /* Prefer blocks whose predecessor is an end of some trace
     or whose predecessor edge is EDGE_DFS_BACK.  */
  int priority = bbd[bb->index].priority;
  if (priority == -1)
    {
      priority = 0;
      FOR_EACH_EDGE (e, ei, bb->preds)
	{
	  if ((e->src != ENTRY_BLOCK_PTR_FOR_FN (cfun)
	       && bbd[e->src->index].end_of_trace >= 0)
	      || (e->flags & EDGE_DFS_BACK))
	    {
	      int edge_freq = EDGE_FREQUENCY (e);

	      if (edge_freq > priority)
		priority = edge_freq;
	    }
	}
      bbd[bb->index].priority = priority;
    }

  if (priority)
    /* A block with priority should have a significantly lower key.  */
    return -(100 * BB_FREQ_MAX + 100 * priority + bb->count.to_frequency (cfun));

  return -bb->count.to_frequency (cfun);
}
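
/* Worked key example (assuming BB_FREQ_MAX == 10000; the numbers are
   otherwise hypothetical): a block with priority 0 and frequency 400 gets
   the key -400, while a block with priority 250 and the same frequency gets
   -(100 * 10000 + 100 * 250 + 400) == -1025400, so the min-heap hands it
   out much earlier.  */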

/* Return true when the edge E from basic block BB is better than the
   temporary best edge (details are in function).  The probability of edge E
   is PROB.  The count of the successor is COUNT.  The current best
   probability is BEST_PROB, the best count is BEST_COUNT.
   The edge is considered to be equivalent when PROB does not differ much from
   BEST_PROB; similarly for count.  */

static bool
better_edge_p (const_basic_block bb, const_edge e, profile_probability prob,
	       profile_count count, profile_probability best_prob,
	       profile_count best_count, const_edge cur_best_edge)
{
  bool is_better_edge;

  /* The BEST_* values do not have to be best, but can be a bit smaller than
     maximum values.  */
  profile_probability diff_prob = best_prob.apply_scale (1, 10);

  /* The smaller one is better to keep the original order.  */
  if (optimize_function_for_size_p (cfun))
    return !cur_best_edge
	   || cur_best_edge->dest->index > e->dest->index;

  /* Those edges are so expensive that continuing a trace is not useful
     performance wise.  */
  if (e->flags & (EDGE_ABNORMAL | EDGE_EH))
    return false;

  if (prob > best_prob + diff_prob
      || (!best_prob.initialized_p ()
	  && prob > profile_probability::guessed_never ()))
    /* The edge has higher probability than the temporary best edge.  */
    is_better_edge = true;
  else if (prob < best_prob - diff_prob)
    /* The edge has lower probability than the temporary best edge.  */
    is_better_edge = false;
  else
    {
      profile_count diff_count = best_count.apply_scale (1, 10);
      if (count < best_count - diff_count
	  || (!best_count.initialized_p ()
	      && count.nonzero_p ()))
	/* The edge and the temporary best edge have almost equivalent
	   probabilities.  A higher count of the successor means that
	   there is another edge going into that successor, so the
	   successor with the lower count is better.  */
	is_better_edge = true;
      else if (count > best_count + diff_count)
	/* This successor has a higher count so it is worse.  */
	is_better_edge = false;
      else if (e->dest->prev_bb == bb)
	/* The edges have equivalent probabilities and the successors
	   have equivalent frequencies.  Select the previous successor.  */
	is_better_edge = true;
      else
	is_better_edge = false;
    }

  return is_better_edge;
}
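
/* For instance (illustrative numbers): with BEST_PROB == 50%, DIFF_PROB is
   5%, so a new edge with PROB == 54% is treated as equivalent and the
   decision falls through to the count comparison; only PROB > 55% wins on
   probability alone.  */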

/* Return true when the edge E is better than the temporary best edge
   CUR_BEST_EDGE.  If SRC_INDEX_P is true, the function compares the src bb of
   E and CUR_BEST_EDGE; otherwise it compares the dest bb.
   BEST_LEN is the trace length of src (or dest) bb in CUR_BEST_EDGE.
   TRACES records the information about traces.
   When optimizing for size, the edge with the smaller index is better.
   When optimizing for speed, the edge with the bigger probability or longer
   trace is better.  */

static bool
connect_better_edge_p (const_edge e, bool src_index_p, int best_len,
		       const_edge cur_best_edge, struct trace *traces)
{
  int e_index;
  int b_index;
  bool is_better_edge;

  if (!cur_best_edge)
    return true;

  if (optimize_function_for_size_p (cfun))
    {
      e_index = src_index_p ? e->src->index : e->dest->index;
      b_index = src_index_p ? cur_best_edge->src->index
			      : cur_best_edge->dest->index;
      /* The smaller one is better to keep the original order.  */
      return b_index > e_index;
    }

  if (src_index_p)
    {
      e_index = e->src->index;

      /* We are looking for a predecessor, so probabilities are not that
	 informative.  We do not want to connect A to B because A has
	 only one successor (probability is 100%) while there is an edge
	 A' to B where the probability is 90% but which is much more
	 frequent.  */
      if (e->count () > cur_best_edge->count ())
	/* The edge has a higher count than the temporary best edge.  */
	is_better_edge = true;
      else if (e->count () < cur_best_edge->count ())
	/* The edge has a lower count than the temporary best edge.  */
	is_better_edge = false;
      else if (e->probability > cur_best_edge->probability)
	/* The edge has higher probability than the temporary best edge.  */
	is_better_edge = true;
      else if (e->probability < cur_best_edge->probability)
	/* The edge has lower probability than the temporary best edge.  */
	is_better_edge = false;
      else if (traces[bbd[e_index].end_of_trace].length > best_len)
	/* The edge and the temporary best edge have equivalent counts and
	   probabilities.  The edge with the longer trace is better.  */
	is_better_edge = true;
      else
	is_better_edge = false;
    }
  else
    {
      e_index = e->dest->index;

      if (e->probability > cur_best_edge->probability)
	/* The edge has higher probability than the temporary best edge.  */
	is_better_edge = true;
      else if (e->probability < cur_best_edge->probability)
	/* The edge has lower probability than the temporary best edge.  */
	is_better_edge = false;
      else if (traces[bbd[e_index].start_of_trace].length > best_len)
	/* The edge and the temporary best edge have equivalent probabilities.
	   The edge with longer trace is better.  */
	is_better_edge = true;
      else
	is_better_edge = false;
    }

  return is_better_edge;
}
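
/* E.g. (hypothetical numbers): when looking for a predecessor, an edge
   A->B with probability 100% but count 10 loses to an edge A'->B with
   probability 90% and count 900, because for predecessors the counts are
   compared first.  */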
106938fd1498Szrj 
107038fd1498Szrj /* Connect traces in array TRACES, N_TRACES is the count of traces.  */
107138fd1498Szrj 
107238fd1498Szrj static void
connect_traces(int n_traces,struct trace * traces)107338fd1498Szrj connect_traces (int n_traces, struct trace *traces)
107438fd1498Szrj {
107538fd1498Szrj   int i;
107638fd1498Szrj   bool *connected;
107738fd1498Szrj   bool two_passes;
107838fd1498Szrj   int last_trace;
107938fd1498Szrj   int current_pass;
108038fd1498Szrj   int current_partition;
108138fd1498Szrj   profile_count count_threshold;
108238fd1498Szrj   bool for_size = optimize_function_for_size_p (cfun);
108338fd1498Szrj 
108438fd1498Szrj   count_threshold = max_entry_count.apply_scale (DUPLICATION_THRESHOLD, 1000);
108538fd1498Szrj 
108638fd1498Szrj   connected = XCNEWVEC (bool, n_traces);
108738fd1498Szrj   last_trace = -1;
108838fd1498Szrj   current_pass = 1;
108938fd1498Szrj   current_partition = BB_PARTITION (traces[0].first);
109038fd1498Szrj   two_passes = false;
109138fd1498Szrj 
109238fd1498Szrj   if (crtl->has_bb_partition)
109338fd1498Szrj     for (i = 0; i < n_traces && !two_passes; i++)
109438fd1498Szrj       if (BB_PARTITION (traces[0].first)
109538fd1498Szrj 	  != BB_PARTITION (traces[i].first))
109638fd1498Szrj 	two_passes = true;
109738fd1498Szrj 
109838fd1498Szrj   for (i = 0; i < n_traces || (two_passes && current_pass == 1) ; i++)
109938fd1498Szrj     {
110038fd1498Szrj       int t = i;
110138fd1498Szrj       int t2;
110238fd1498Szrj       edge e, best;
110338fd1498Szrj       int best_len;
110438fd1498Szrj 
110538fd1498Szrj       if (i >= n_traces)
110638fd1498Szrj 	{
110738fd1498Szrj 	  gcc_assert (two_passes && current_pass == 1);
110838fd1498Szrj 	  i = 0;
110938fd1498Szrj 	  t = i;
111038fd1498Szrj 	  current_pass = 2;
111138fd1498Szrj 	  if (current_partition == BB_HOT_PARTITION)
111238fd1498Szrj 	    current_partition = BB_COLD_PARTITION;
111338fd1498Szrj 	  else
111438fd1498Szrj 	    current_partition = BB_HOT_PARTITION;
111538fd1498Szrj 	}
111638fd1498Szrj 
111738fd1498Szrj       if (connected[t])
111838fd1498Szrj 	continue;
111938fd1498Szrj 
112038fd1498Szrj       if (two_passes
112138fd1498Szrj 	  && BB_PARTITION (traces[t].first) != current_partition)
112238fd1498Szrj 	continue;
112338fd1498Szrj 
112438fd1498Szrj       connected[t] = true;
112538fd1498Szrj 
112638fd1498Szrj       /* Find the predecessor traces.  */
112738fd1498Szrj       for (t2 = t; t2 > 0;)
112838fd1498Szrj 	{
112938fd1498Szrj 	  edge_iterator ei;
113038fd1498Szrj 	  best = NULL;
113138fd1498Szrj 	  best_len = 0;
113238fd1498Szrj 	  FOR_EACH_EDGE (e, ei, traces[t2].first->preds)
113338fd1498Szrj 	    {
113438fd1498Szrj 	      int si = e->src->index;
113538fd1498Szrj 
113638fd1498Szrj 	      if (e->src != ENTRY_BLOCK_PTR_FOR_FN (cfun)
113738fd1498Szrj 		  && (e->flags & EDGE_CAN_FALLTHRU)
113838fd1498Szrj 		  && !(e->flags & EDGE_COMPLEX)
113938fd1498Szrj 		  && bbd[si].end_of_trace >= 0
114038fd1498Szrj 		  && !connected[bbd[si].end_of_trace]
114138fd1498Szrj 		  && (BB_PARTITION (e->src) == current_partition)
114238fd1498Szrj 		  && connect_better_edge_p (e, true, best_len, best, traces))
114338fd1498Szrj 		{
114438fd1498Szrj 		  best = e;
114538fd1498Szrj 		  best_len = traces[bbd[si].end_of_trace].length;
114638fd1498Szrj 		}
114738fd1498Szrj 	    }
114838fd1498Szrj 	  if (best)
114938fd1498Szrj 	    {
115038fd1498Szrj 	      best->src->aux = best->dest;
115138fd1498Szrj 	      t2 = bbd[best->src->index].end_of_trace;
115238fd1498Szrj 	      connected[t2] = true;
115338fd1498Szrj 
115438fd1498Szrj 	      if (dump_file)
115538fd1498Szrj 		{
115638fd1498Szrj 		  fprintf (dump_file, "Connection: %d %d\n",
115738fd1498Szrj 			   best->src->index, best->dest->index);
115838fd1498Szrj 		}
115938fd1498Szrj 	    }
116038fd1498Szrj 	  else
116138fd1498Szrj 	    break;
116238fd1498Szrj 	}
116338fd1498Szrj 
116438fd1498Szrj       if (last_trace >= 0)
116538fd1498Szrj 	traces[last_trace].last->aux = traces[t2].first;
116638fd1498Szrj       last_trace = t;
116738fd1498Szrj 
116838fd1498Szrj       /* Find the successor traces.  */
116938fd1498Szrj       while (1)
117038fd1498Szrj 	{
117138fd1498Szrj 	  /* Find the continuation of the chain.  */
117238fd1498Szrj 	  edge_iterator ei;
117338fd1498Szrj 	  best = NULL;
117438fd1498Szrj 	  best_len = 0;
117538fd1498Szrj 	  FOR_EACH_EDGE (e, ei, traces[t].last->succs)
117638fd1498Szrj 	    {
117738fd1498Szrj 	      int di = e->dest->index;
117838fd1498Szrj 
117938fd1498Szrj 	      if (e->dest != EXIT_BLOCK_PTR_FOR_FN (cfun)
118038fd1498Szrj 		  && (e->flags & EDGE_CAN_FALLTHRU)
118138fd1498Szrj 		  && !(e->flags & EDGE_COMPLEX)
118238fd1498Szrj 		  && bbd[di].start_of_trace >= 0
118338fd1498Szrj 		  && !connected[bbd[di].start_of_trace]
118438fd1498Szrj 		  && (BB_PARTITION (e->dest) == current_partition)
118538fd1498Szrj 		  && connect_better_edge_p (e, false, best_len, best, traces))
118638fd1498Szrj 		{
118738fd1498Szrj 		  best = e;
118838fd1498Szrj 		  best_len = traces[bbd[di].start_of_trace].length;
118938fd1498Szrj 		}
119038fd1498Szrj 	    }
119138fd1498Szrj 
119238fd1498Szrj 	  if (for_size)
119338fd1498Szrj 	    {
119438fd1498Szrj 	      if (!best)
119538fd1498Szrj 		/* Stop finding the successor traces.  */
119638fd1498Szrj 		break;
119738fd1498Szrj 
119838fd1498Szrj 	      /* It is OK to connect block n with block n + 1 or a block
119938fd1498Szrj 		 before n.  For others, only connect to the loop header.  */
120038fd1498Szrj 	      if (best->dest->index > (traces[t].last->index + 1))
120138fd1498Szrj 		{
120238fd1498Szrj 		  int count = EDGE_COUNT (best->dest->preds);
120338fd1498Szrj 
120438fd1498Szrj 		  FOR_EACH_EDGE (e, ei, best->dest->preds)
120538fd1498Szrj 		    if (e->flags & EDGE_DFS_BACK)
120638fd1498Szrj 		      count--;
120738fd1498Szrj 
120838fd1498Szrj 		  /* If dest has multiple predecessors, skip it.  We expect
120938fd1498Szrj 		     that one predecessor with a smaller index connects with it
121038fd1498Szrj 		     later.  */
121138fd1498Szrj 		  if (count != 1)
121238fd1498Szrj 		    break;
121338fd1498Szrj 		}
121438fd1498Szrj 
121538fd1498Szrj 	      /* Only connect Trace n with Trace n + 1.  It is conservative
121638fd1498Szrj 		 to keep the order as close as possible to the original order.
121738fd1498Szrj 		 It also helps to reduce long jumps.  */
121838fd1498Szrj 	      if (last_trace != bbd[best->dest->index].start_of_trace - 1)
121938fd1498Szrj 		break;
122038fd1498Szrj 
122138fd1498Szrj 	      if (dump_file)
122238fd1498Szrj 		fprintf (dump_file, "Connection: %d %d\n",
122338fd1498Szrj 			 best->src->index, best->dest->index);
122438fd1498Szrj 
122538fd1498Szrj 	      t = bbd[best->dest->index].start_of_trace;
122638fd1498Szrj 	      traces[last_trace].last->aux = traces[t].first;
122738fd1498Szrj 	      connected[t] = true;
122838fd1498Szrj 	      last_trace = t;
122938fd1498Szrj 	    }
123038fd1498Szrj 	  else if (best)
123138fd1498Szrj 	    {
123238fd1498Szrj 	      if (dump_file)
123338fd1498Szrj 		{
123438fd1498Szrj 		  fprintf (dump_file, "Connection: %d %d\n",
123538fd1498Szrj 			   best->src->index, best->dest->index);
123638fd1498Szrj 		}
123738fd1498Szrj 	      t = bbd[best->dest->index].start_of_trace;
123838fd1498Szrj 	      traces[last_trace].last->aux = traces[t].first;
123938fd1498Szrj 	      connected[t] = true;
124038fd1498Szrj 	      last_trace = t;
124138fd1498Szrj 	    }
124238fd1498Szrj 	  else
124338fd1498Szrj 	    {
124438fd1498Szrj 	      /* Try to connect the traces by duplication of 1 block.  */
124538fd1498Szrj 	      edge e2;
124638fd1498Szrj 	      basic_block next_bb = NULL;
124738fd1498Szrj 	      bool try_copy = false;
124838fd1498Szrj 
124938fd1498Szrj 	      FOR_EACH_EDGE (e, ei, traces[t].last->succs)
125038fd1498Szrj 		if (e->dest != EXIT_BLOCK_PTR_FOR_FN (cfun)
125138fd1498Szrj 		    && (e->flags & EDGE_CAN_FALLTHRU)
125238fd1498Szrj 		    && !(e->flags & EDGE_COMPLEX)
125338fd1498Szrj 		    && (!best || e->probability > best->probability))
125438fd1498Szrj 		  {
125538fd1498Szrj 		    edge_iterator ei;
125638fd1498Szrj 		    edge best2 = NULL;
125738fd1498Szrj 		    int best2_len = 0;
125838fd1498Szrj 
125938fd1498Szrj 		    /* If the destination is a start of a trace which is only
126038fd1498Szrj 		       one block long, then no need to search the successor
126138fd1498Szrj 		       blocks of the trace.  Accept it.  */
126238fd1498Szrj 		    if (bbd[e->dest->index].start_of_trace >= 0
126338fd1498Szrj 			&& traces[bbd[e->dest->index].start_of_trace].length
126438fd1498Szrj 			   == 1)
126538fd1498Szrj 		      {
126638fd1498Szrj 			best = e;
126738fd1498Szrj 			try_copy = true;
126838fd1498Szrj 			continue;
126938fd1498Szrj 		      }
127038fd1498Szrj 
127138fd1498Szrj 		    FOR_EACH_EDGE (e2, ei, e->dest->succs)
127238fd1498Szrj 		      {
127338fd1498Szrj 			int di = e2->dest->index;
127438fd1498Szrj 
127538fd1498Szrj 			if (e2->dest == EXIT_BLOCK_PTR_FOR_FN (cfun)
127638fd1498Szrj 			    || ((e2->flags & EDGE_CAN_FALLTHRU)
127738fd1498Szrj 				&& !(e2->flags & EDGE_COMPLEX)
127838fd1498Szrj 				&& bbd[di].start_of_trace >= 0
127938fd1498Szrj 				&& !connected[bbd[di].start_of_trace]
128038fd1498Szrj 				&& BB_PARTITION (e2->dest) == current_partition
128138fd1498Szrj 				&& e2->count () >= count_threshold
128238fd1498Szrj 				&& (!best2
128338fd1498Szrj 				    || e2->probability > best2->probability
128438fd1498Szrj 				    || (e2->probability == best2->probability
128538fd1498Szrj 					&& traces[bbd[di].start_of_trace].length
128638fd1498Szrj 					   > best2_len))))
128738fd1498Szrj 			  {
128838fd1498Szrj 			    best = e;
128938fd1498Szrj 			    best2 = e2;
129038fd1498Szrj 			    if (e2->dest != EXIT_BLOCK_PTR_FOR_FN (cfun))
129138fd1498Szrj 			      best2_len = traces[bbd[di].start_of_trace].length;
129238fd1498Szrj 			    else
129338fd1498Szrj 			      best2_len = INT_MAX;
129438fd1498Szrj 			    next_bb = e2->dest;
129538fd1498Szrj 			    try_copy = true;
129638fd1498Szrj 			  }
129738fd1498Szrj 		      }
129838fd1498Szrj 		  }
129938fd1498Szrj 
130038fd1498Szrj 	      /* Copy tiny blocks always; copy larger blocks only when the
130138fd1498Szrj 		 edge is traversed frequently enough.  */
130238fd1498Szrj 	      if (try_copy
130338fd1498Szrj 		  && BB_PARTITION (best->src) == BB_PARTITION (best->dest)
130438fd1498Szrj 		  && copy_bb_p (best->dest,
130538fd1498Szrj 				optimize_edge_for_speed_p (best)
130638fd1498Szrj 				&& (!best->count ().initialized_p ()
130738fd1498Szrj 				    || best->count () >= count_threshold)))
130838fd1498Szrj 		{
130938fd1498Szrj 		  basic_block new_bb;
131038fd1498Szrj 
131138fd1498Szrj 		  if (dump_file)
131238fd1498Szrj 		    {
131338fd1498Szrj 		      fprintf (dump_file, "Connection: %d %d ",
131438fd1498Szrj 			       traces[t].last->index, best->dest->index);
131538fd1498Szrj 		      if (!next_bb)
131638fd1498Szrj 			fputc ('\n', dump_file);
131738fd1498Szrj 		      else if (next_bb == EXIT_BLOCK_PTR_FOR_FN (cfun))
131838fd1498Szrj 			fprintf (dump_file, "exit\n");
131938fd1498Szrj 		      else
132038fd1498Szrj 			fprintf (dump_file, "%d\n", next_bb->index);
132138fd1498Szrj 		    }
132238fd1498Szrj 
132338fd1498Szrj 		  new_bb = copy_bb (best->dest, best, traces[t].last, t);
132438fd1498Szrj 		  traces[t].last = new_bb;
132538fd1498Szrj 		  if (next_bb && next_bb != EXIT_BLOCK_PTR_FOR_FN (cfun))
132638fd1498Szrj 		    {
132738fd1498Szrj 		      t = bbd[next_bb->index].start_of_trace;
132838fd1498Szrj 		      traces[last_trace].last->aux = traces[t].first;
132938fd1498Szrj 		      connected[t] = true;
133038fd1498Szrj 		      last_trace = t;
133138fd1498Szrj 		    }
133238fd1498Szrj 		  else
133338fd1498Szrj 		    break;	/* Stop finding the successor traces.  */
133438fd1498Szrj 		}
133538fd1498Szrj 	      else
133638fd1498Szrj 		break;	/* Stop finding the successor traces.  */
133738fd1498Szrj 	    }
133838fd1498Szrj 	}
133938fd1498Szrj     }
134038fd1498Szrj 
134138fd1498Szrj   if (dump_file)
134238fd1498Szrj     {
134338fd1498Szrj       basic_block bb;
134438fd1498Szrj 
134538fd1498Szrj       fprintf (dump_file, "Final order:\n");
134638fd1498Szrj       for (bb = traces[0].first; bb; bb = (basic_block) bb->aux)
134738fd1498Szrj 	fprintf (dump_file, "%d ", bb->index);
134838fd1498Szrj       fprintf (dump_file, "\n");
134938fd1498Szrj       fflush (dump_file);
135038fd1498Szrj     }
135138fd1498Szrj 
135238fd1498Szrj   FREE (connected);
135338fd1498Szrj }
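/* Summary of the connection loop above (informal): the final block
   order is threaded through the basic blocks' AUX pointers, where
   bb->aux names the block to be laid out immediately after bb.  The
   chain is built by first connecting predecessor traces, then
   successor traces, and, when not optimizing for size, by duplicating
   a single block (copy_bb) to manufacture a fallthrough where none
   exists.  */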
135438fd1498Szrj 
135538fd1498Szrj /* Return true when BB can and should be copied. CODE_MAY_GROW is true
135638fd1498Szrj    when code size is allowed to grow by duplication.  */
135738fd1498Szrj 
135838fd1498Szrj static bool
135938fd1498Szrj copy_bb_p (const_basic_block bb, int code_may_grow)
136038fd1498Szrj {
136138fd1498Szrj   int size = 0;
136238fd1498Szrj   int max_size = uncond_jump_length;
136338fd1498Szrj   rtx_insn *insn;
136438fd1498Szrj 
136538fd1498Szrj   if (EDGE_COUNT (bb->preds) < 2)
136638fd1498Szrj     return false;
136738fd1498Szrj   if (!can_duplicate_block_p (bb))
136838fd1498Szrj     return false;
136938fd1498Szrj 
137038fd1498Szrj   /* Avoid duplicating blocks which have many successors (PR/13430).  */
137138fd1498Szrj   if (EDGE_COUNT (bb->succs) > 8)
137238fd1498Szrj     return false;
137338fd1498Szrj 
137438fd1498Szrj   if (code_may_grow && optimize_bb_for_speed_p (bb))
137538fd1498Szrj     max_size *= PARAM_VALUE (PARAM_MAX_GROW_COPY_BB_INSNS);
137638fd1498Szrj 
137738fd1498Szrj   FOR_BB_INSNS (bb, insn)
137838fd1498Szrj     {
137938fd1498Szrj       if (INSN_P (insn))
138038fd1498Szrj 	size += get_attr_min_length (insn);
138138fd1498Szrj     }
138238fd1498Szrj 
138338fd1498Szrj   if (size <= max_size)
138438fd1498Szrj     return true;
138538fd1498Szrj 
138638fd1498Szrj   if (dump_file)
138738fd1498Szrj     {
138838fd1498Szrj       fprintf (dump_file,
138938fd1498Szrj 	       "Block %d can't be copied because its size = %d.\n",
139038fd1498Szrj 	       bb->index, size);
139138fd1498Szrj     }
139238fd1498Szrj 
139338fd1498Szrj   return false;
139438fd1498Szrj }
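/* A worked example of the size budget above (numbers illustrative
   only): with uncond_jump_length == 2 and
   PARAM_MAX_GROW_COPY_BB_INSNS == 8, a block on a speed-optimized
   path may be duplicated as long as its min-length insns total at
   most 16 units; otherwise only a block no larger than the 2-unit
   jump that duplication removes is copied.  */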
139538fd1498Szrj 
139638fd1498Szrj /* Return the length of an unconditional jump instruction.  */
139738fd1498Szrj 
139838fd1498Szrj int
139938fd1498Szrj get_uncond_jump_length (void)
140038fd1498Szrj {
140138fd1498Szrj   int length;
140238fd1498Szrj 
140338fd1498Szrj   start_sequence ();
140438fd1498Szrj   rtx_code_label *label = emit_label (gen_label_rtx ());
140538fd1498Szrj   rtx_insn *jump = emit_jump_insn (targetm.gen_jump (label));
140638fd1498Szrj   length = get_attr_min_length (jump);
140738fd1498Szrj   end_sequence ();
140838fd1498Szrj 
140938fd1498Szrj   return length;
141038fd1498Szrj }
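/* Note that the length above is measured rather than hard-coded: a
   throwaway label and a jump to it are emitted into a scratch
   sequence, the target's minimal length attribute is queried, and the
   sequence is discarded, so the estimate automatically follows the
   target's insn-length attributes.  */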
141138fd1498Szrj 
1412*58e805e6Szrj /* Create a forwarder block to OLD_BB starting with NEW_LABEL, placed in
1413*58e805e6Szrj    the opposite partition from OLD_BB.  */
1414*58e805e6Szrj 
1415*58e805e6Szrj static basic_block
1416*58e805e6Szrj create_eh_forwarder_block (rtx_code_label *new_label, basic_block old_bb)
1417*58e805e6Szrj {
1418*58e805e6Szrj   /* Split OLD_BB so that EH pads always have only incoming EH edges;
1419*58e805e6Szrj      bb_has_eh_pred bbs are treated specially by the DF infrastructure.  */
1420*58e805e6Szrj   old_bb = split_block_after_labels (old_bb)->dest;
1421*58e805e6Szrj 
1422*58e805e6Szrj   /* Put the new label and a jump in the new basic block.  */
1423*58e805e6Szrj   rtx_insn *label = emit_label (new_label);
1424*58e805e6Szrj   rtx_code_label *old_label = block_label (old_bb);
1425*58e805e6Szrj   rtx_insn *jump = emit_jump_insn (targetm.gen_jump (old_label));
1426*58e805e6Szrj   JUMP_LABEL (jump) = old_label;
1427*58e805e6Szrj 
1428*58e805e6Szrj   /* Create the new basic block and put it in last position.  */
1429*58e805e6Szrj   basic_block last_bb = EXIT_BLOCK_PTR_FOR_FN (cfun)->prev_bb;
1430*58e805e6Szrj   basic_block new_bb = create_basic_block (label, jump, last_bb);
1431*58e805e6Szrj   new_bb->aux = last_bb->aux;
1432*58e805e6Szrj   new_bb->count = old_bb->count;
1433*58e805e6Szrj   last_bb->aux = new_bb;
1434*58e805e6Szrj 
1435*58e805e6Szrj   emit_barrier_after_bb (new_bb);
1436*58e805e6Szrj 
1437*58e805e6Szrj   make_single_succ_edge (new_bb, old_bb, 0);
1438*58e805e6Szrj 
1439*58e805e6Szrj   /* Make sure the new basic block is in the other partition.  */
1440*58e805e6Szrj   unsigned new_partition = BB_PARTITION (old_bb);
1441*58e805e6Szrj   new_partition ^= BB_HOT_PARTITION | BB_COLD_PARTITION;
1442*58e805e6Szrj   BB_SET_PARTITION (new_bb, new_partition);
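  /* Illustrative note on the XOR above: exactly one of the two
     partition flag bits is set for a partitioned block, so XORing
     with both bits toggles hot to cold and vice versa.  E.g. with
     hypothetical encodings hot = 0b01 and cold = 0b10,
     01 ^ 11 == 10 and 10 ^ 11 == 01.  */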
1443*58e805e6Szrj 
1444*58e805e6Szrj   return new_bb;
1445*58e805e6Szrj }
1446*58e805e6Szrj 
1447*58e805e6Szrj /* The common landing pad in block OLD_BB has edges from both partitions.
1448*58e805e6Szrj    Add a new landing pad that will just jump to the old one and split the
1449*58e805e6Szrj    edges so that no EH edge crosses partitions.  */
1450*58e805e6Szrj 
1451*58e805e6Szrj static void
1452*58e805e6Szrj sjlj_fix_up_crossing_landing_pad (basic_block old_bb)
1453*58e805e6Szrj {
1454*58e805e6Szrj   const unsigned lp_len = cfun->eh->lp_array->length ();
1455*58e805e6Szrj   edge_iterator ei;
1456*58e805e6Szrj   edge e;
1457*58e805e6Szrj 
1458*58e805e6Szrj   /* Generate the new common landing-pad label.  */
1459*58e805e6Szrj   rtx_code_label *new_label = gen_label_rtx ();
1460*58e805e6Szrj   LABEL_PRESERVE_P (new_label) = 1;
1461*58e805e6Szrj 
1462*58e805e6Szrj   /* Create the forwarder block.  */
1463*58e805e6Szrj   basic_block new_bb = create_eh_forwarder_block (new_label, old_bb);
1464*58e805e6Szrj 
1465*58e805e6Szrj   /* Create the map from old to new lp index and initialize it.  */
1466*58e805e6Szrj   unsigned *index_map = (unsigned *) alloca (lp_len * sizeof (unsigned));
1467*58e805e6Szrj   memset (index_map, 0, lp_len * sizeof (unsigned));
1468*58e805e6Szrj 
1469*58e805e6Szrj   /* Fix up the edges.  */
1470*58e805e6Szrj   for (ei = ei_start (old_bb->preds); (e = ei_safe_edge (ei)) != NULL; )
1471*58e805e6Szrj     if (e->src != new_bb && BB_PARTITION (e->src) == BB_PARTITION (new_bb))
1472*58e805e6Szrj       {
1473*58e805e6Szrj 	rtx_insn *insn = BB_END (e->src);
1474*58e805e6Szrj 	rtx note = find_reg_note (insn, REG_EH_REGION, NULL_RTX);
1475*58e805e6Szrj 
1476*58e805e6Szrj 	gcc_assert (note != NULL);
1477*58e805e6Szrj 	const unsigned old_index = INTVAL (XEXP (note, 0));
1478*58e805e6Szrj 
1479*58e805e6Szrj 	/* Generate the new landing-pad structure.  */
1480*58e805e6Szrj 	if (index_map[old_index] == 0)
1481*58e805e6Szrj 	  {
1482*58e805e6Szrj 	    eh_landing_pad old_lp = (*cfun->eh->lp_array)[old_index];
1483*58e805e6Szrj 	    eh_landing_pad new_lp = gen_eh_landing_pad (old_lp->region);
1484*58e805e6Szrj 	    new_lp->post_landing_pad = old_lp->post_landing_pad;
1485*58e805e6Szrj 	    new_lp->landing_pad = new_label;
1486*58e805e6Szrj 	    index_map[old_index] = new_lp->index;
1487*58e805e6Szrj 	  }
1488*58e805e6Szrj 	XEXP (note, 0) = GEN_INT (index_map[old_index]);
1489*58e805e6Szrj 
1490*58e805e6Szrj 	/* Adjust the edge to the new destination.  */
1491*58e805e6Szrj 	redirect_edge_succ (e, new_bb);
1492*58e805e6Szrj       }
1493*58e805e6Szrj     else
1494*58e805e6Szrj       ei_next (&ei);
1495*58e805e6Szrj }
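/* Usage note (informal): in SJLJ mode all landing pads funnel through
   a single common dispatch point, so the function above is invoked at
   most once per function.  The index_map ensures that two crossing
   edges whose REG_EH_REGION notes carry the same old index share a
   single replacement landing pad instead of each allocating one.  */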
1496*58e805e6Szrj 
149738fd1498Szrj /* The landing pad OLD_LP, in block OLD_BB, has edges from both partitions.
149838fd1498Szrj    Add a new landing pad that will just jump to the old one and split the
149938fd1498Szrj    edges so that no EH edge crosses partitions.  */
150038fd1498Szrj 
150138fd1498Szrj static void
1502*58e805e6Szrj dw2_fix_up_crossing_landing_pad (eh_landing_pad old_lp, basic_block old_bb)
150338fd1498Szrj {
150438fd1498Szrj   eh_landing_pad new_lp;
150538fd1498Szrj   edge_iterator ei;
150638fd1498Szrj   edge e;
150738fd1498Szrj 
150838fd1498Szrj   /* Generate the new landing-pad structure.  */
150938fd1498Szrj   new_lp = gen_eh_landing_pad (old_lp->region);
151038fd1498Szrj   new_lp->post_landing_pad = old_lp->post_landing_pad;
151138fd1498Szrj   new_lp->landing_pad = gen_label_rtx ();
151238fd1498Szrj   LABEL_PRESERVE_P (new_lp->landing_pad) = 1;
151338fd1498Szrj 
1514*58e805e6Szrj   /* Create the forwarder block.  */
1515*58e805e6Szrj   basic_block new_bb = create_eh_forwarder_block (new_lp->landing_pad, old_bb);
151638fd1498Szrj 
151738fd1498Szrj   /* Fix up the edges.  */
151838fd1498Szrj   for (ei = ei_start (old_bb->preds); (e = ei_safe_edge (ei)) != NULL; )
1519*58e805e6Szrj     if (e->src != new_bb && BB_PARTITION (e->src) == BB_PARTITION (new_bb))
152038fd1498Szrj       {
152138fd1498Szrj 	rtx_insn *insn = BB_END (e->src);
152238fd1498Szrj 	rtx note = find_reg_note (insn, REG_EH_REGION, NULL_RTX);
152338fd1498Szrj 
152438fd1498Szrj 	gcc_assert (note != NULL);
152538fd1498Szrj 	gcc_checking_assert (INTVAL (XEXP (note, 0)) == old_lp->index);
152638fd1498Szrj 	XEXP (note, 0) = GEN_INT (new_lp->index);
152738fd1498Szrj 
152838fd1498Szrj 	/* Adjust the edge to the new destination.  */
152938fd1498Szrj 	redirect_edge_succ (e, new_bb);
153038fd1498Szrj       }
153138fd1498Szrj     else
153238fd1498Szrj       ei_next (&ei);
153338fd1498Szrj }
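/* Contrast with the SJLJ variant above: with DWARF2 unwinding each
   landing pad has its own label, so the fix-up is performed per
   landing pad, and every incoming REG_EH_REGION note is checked to
   carry exactly OLD_LP's index before being rewritten to the index of
   the new landing pad.  */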
153438fd1498Szrj 
153538fd1498Szrj 
153638fd1498Szrj /* Ensure that all hot bbs are included in a hot path through the
153738fd1498Szrj    procedure. This is done by calling this function twice, once
153838fd1498Szrj    with WALK_UP true (to look for paths from the entry to hot bbs) and
153938fd1498Szrj    once with WALK_UP false (to look for paths from hot bbs to the exit).
154038fd1498Szrj    Returns the updated value of COLD_BB_COUNT and adds newly-hot bbs
154138fd1498Szrj    to BBS_IN_HOT_PARTITION.  */
154238fd1498Szrj 
154338fd1498Szrj static unsigned int
154438fd1498Szrj sanitize_hot_paths (bool walk_up, unsigned int cold_bb_count,
154538fd1498Szrj                     vec<basic_block> *bbs_in_hot_partition)
154638fd1498Szrj {
154738fd1498Szrj   /* Callers check this.  */
154838fd1498Szrj   gcc_checking_assert (cold_bb_count);
154938fd1498Szrj 
155038fd1498Szrj   /* Keep examining hot bbs while we still have some left to check
155138fd1498Szrj      and there are remaining cold bbs.  */
155238fd1498Szrj   vec<basic_block> hot_bbs_to_check = bbs_in_hot_partition->copy ();
155338fd1498Szrj   while (! hot_bbs_to_check.is_empty ()
155438fd1498Szrj          && cold_bb_count)
155538fd1498Szrj     {
155638fd1498Szrj       basic_block bb = hot_bbs_to_check.pop ();
155738fd1498Szrj       vec<edge, va_gc> *edges = walk_up ? bb->preds : bb->succs;
155838fd1498Szrj       edge e;
155938fd1498Szrj       edge_iterator ei;
156038fd1498Szrj       profile_probability highest_probability
156138fd1498Szrj 				 = profile_probability::uninitialized ();
156238fd1498Szrj       profile_count highest_count = profile_count::uninitialized ();
156338fd1498Szrj       bool found = false;
156438fd1498Szrj 
156538fd1498Szrj       /* Walk the preds/succs and check if there is at least one already
156638fd1498Szrj          marked hot. Keep track of the most frequent pred/succ so that we
156738fd1498Szrj          can mark it hot if we don't find one.  */
156838fd1498Szrj       FOR_EACH_EDGE (e, ei, edges)
156938fd1498Szrj         {
157038fd1498Szrj           basic_block reach_bb = walk_up ? e->src : e->dest;
157138fd1498Szrj 
157238fd1498Szrj           if (e->flags & EDGE_DFS_BACK)
157338fd1498Szrj             continue;
157438fd1498Szrj 
157538fd1498Szrj 	  /* Do not expect profile insanities when the profile was not adjusted.  */
157638fd1498Szrj 	  if (e->probability == profile_probability::never ()
157738fd1498Szrj 	      || e->count () == profile_count::zero ())
157838fd1498Szrj 	    continue;
157938fd1498Szrj 
158038fd1498Szrj           if (BB_PARTITION (reach_bb) != BB_COLD_PARTITION)
158138fd1498Szrj           {
158238fd1498Szrj             found = true;
158338fd1498Szrj             break;
158438fd1498Szrj           }
158538fd1498Szrj           /* Track the hottest edge via the edge count if it is
158638fd1498Szrj              non-zero, falling back to the edge probability, for use
158738fd1498Szrj              by the promotion loop further below.  */
158838fd1498Szrj           if (!(e->count () > highest_count))
158938fd1498Szrj             highest_count = e->count ();
159038fd1498Szrj           if (!highest_probability.initialized_p ()
159138fd1498Szrj 	      || e->probability > highest_probability)
159238fd1498Szrj             highest_probability = e->probability;
159338fd1498Szrj         }
159438fd1498Szrj 
159538fd1498Szrj       /* If bb is reached by (or reaches, in the case of !WALK_UP) another hot
159638fd1498Szrj          block (or unpartitioned, e.g. the entry block) then it is ok. If not,
159738fd1498Szrj          then the most frequent pred (or succ) needs to be adjusted.  In the
159838fd1498Szrj          case where multiple preds/succs have the same frequency (e.g. a
159938fd1498Szrj          50-50 branch), then both will be adjusted.  */
160038fd1498Szrj       if (found)
160138fd1498Szrj         continue;
160238fd1498Szrj 
160338fd1498Szrj       FOR_EACH_EDGE (e, ei, edges)
160438fd1498Szrj         {
160538fd1498Szrj           if (e->flags & EDGE_DFS_BACK)
160638fd1498Szrj             continue;
160738fd1498Szrj 	  /* Do not expect profile insanities when the profile was not adjusted.  */
160838fd1498Szrj 	  if (e->probability == profile_probability::never ()
160938fd1498Szrj 	      || e->count () == profile_count::zero ())
161038fd1498Szrj 	    continue;
161138fd1498Szrj           /* Select the hottest edge using the edge count if it is non-zero;
161238fd1498Szrj              otherwise fall back to the edge probability.  */
161338fd1498Szrj           if (highest_count.initialized_p ())
161438fd1498Szrj             {
161538fd1498Szrj               if (!(e->count () >= highest_count))
161638fd1498Szrj                 continue;
161738fd1498Szrj             }
161838fd1498Szrj           else if (!(e->probability >= highest_probability))
161938fd1498Szrj             continue;
162038fd1498Szrj 
162138fd1498Szrj           basic_block reach_bb = walk_up ? e->src : e->dest;
162238fd1498Szrj 
162338fd1498Szrj           /* We have a hot bb with an immediate dominator that is cold.
162438fd1498Szrj              The dominator needs to be re-marked hot.  */
162538fd1498Szrj           BB_SET_PARTITION (reach_bb, BB_HOT_PARTITION);
162638fd1498Szrj 	  if (dump_file)
162738fd1498Szrj 	    fprintf (dump_file, "Promoting bb %i to hot partition to sanitize "
162838fd1498Szrj 		     "profile of bb %i in %s walk\n", reach_bb->index,
162938fd1498Szrj 		     bb->index, walk_up ? "backward" : "forward");
163038fd1498Szrj           cold_bb_count--;
163138fd1498Szrj 
163238fd1498Szrj           /* Now we need to examine newly-hot reach_bb to see if it is also
163338fd1498Szrj              dominated by a cold bb.  */
163438fd1498Szrj           bbs_in_hot_partition->safe_push (reach_bb);
163538fd1498Szrj           hot_bbs_to_check.safe_push (reach_bb);
163638fd1498Szrj         }
163738fd1498Szrj     }
1638*58e805e6Szrj   hot_bbs_to_check.release ();
163938fd1498Szrj 
164038fd1498Szrj   return cold_bb_count;
164138fd1498Szrj }
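/* A small worked example of the sanitization above (hypothetical
   CFG): suppose hot block B is reachable only from cold predecessors.
   The first walk over B's edges finds no hot predecessor, so the
   second loop promotes the most frequent predecessor(s) to the hot
   partition, ties promoting all of them, decrements cold_bb_count
   accordingly, and pushes each promoted block onto both worklists so
   that its own predecessors are examined in turn.  */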
164238fd1498Szrj 
164338fd1498Szrj 
164438fd1498Szrj /* Find the basic blocks that are rarely executed and need to be moved to
164538fd1498Szrj    a separate section of the .o file (to cut down on paging and improve
164638fd1498Szrj    cache locality).  Return a vector of all edges that cross.  */
164738fd1498Szrj 
164838fd1498Szrj static vec<edge>
164938fd1498Szrj find_rarely_executed_basic_blocks_and_crossing_edges (void)
165038fd1498Szrj {
165138fd1498Szrj   vec<edge> crossing_edges = vNULL;
165238fd1498Szrj   basic_block bb;
165338fd1498Szrj   edge e;
165438fd1498Szrj   edge_iterator ei;
165538fd1498Szrj   unsigned int cold_bb_count = 0;
165638fd1498Szrj   auto_vec<basic_block> bbs_in_hot_partition;
165738fd1498Szrj 
165838fd1498Szrj   propagate_unlikely_bbs_forward ();
165938fd1498Szrj 
166038fd1498Szrj   /* Mark which partition (hot/cold) each basic block belongs in.  */
166138fd1498Szrj   FOR_EACH_BB_FN (bb, cfun)
166238fd1498Szrj     {
166338fd1498Szrj       bool cold_bb = false;
166438fd1498Szrj 
166538fd1498Szrj       if (probably_never_executed_bb_p (cfun, bb))
166638fd1498Szrj         {
166738fd1498Szrj           /* Handle profile insanities created by upstream optimizations
166838fd1498Szrj              by also checking the incoming edge weights. If there is a non-cold
166938fd1498Szrj              incoming edge, conservatively prevent this block from being split
167038fd1498Szrj              into the cold section.  */
167138fd1498Szrj           cold_bb = true;
167238fd1498Szrj           FOR_EACH_EDGE (e, ei, bb->preds)
167338fd1498Szrj             if (!probably_never_executed_edge_p (cfun, e))
167438fd1498Szrj               {
167538fd1498Szrj                 cold_bb = false;
167638fd1498Szrj                 break;
167738fd1498Szrj               }
167838fd1498Szrj         }
167938fd1498Szrj       if (cold_bb)
168038fd1498Szrj         {
168138fd1498Szrj           BB_SET_PARTITION (bb, BB_COLD_PARTITION);
168238fd1498Szrj           cold_bb_count++;
168338fd1498Szrj         }
168438fd1498Szrj       else
168538fd1498Szrj         {
168638fd1498Szrj           BB_SET_PARTITION (bb, BB_HOT_PARTITION);
168738fd1498Szrj           bbs_in_hot_partition.safe_push (bb);
168838fd1498Szrj         }
168938fd1498Szrj     }
169038fd1498Szrj 
169138fd1498Szrj   /* Ensure that hot bbs are included along a hot path from the entry to exit.
169238fd1498Szrj      Several different situations may leave cold bbs along all paths
169338fd1498Szrj      to/from a hot bb.  One is that there are edge weight insanities
169438fd1498Szrj      due to optimization phases that do not properly update basic block profile
169538fd1498Szrj      counts. The second is that the entry of the function may not be hot, because
169638fd1498Szrj      it is entered fewer times than the number of profile training runs, but there
169738fd1498Szrj      is a loop inside the function that causes blocks within the function to be
169838fd1498Szrj      above the threshold for hotness. This is fixed by walking up from hot bbs
169938fd1498Szrj      to the entry block, and then down from hot bbs to the exit, performing
170038fd1498Szrj      partitioning fixups as necessary.  */
170138fd1498Szrj   if (cold_bb_count)
170238fd1498Szrj     {
170338fd1498Szrj       mark_dfs_back_edges ();
170438fd1498Szrj       cold_bb_count = sanitize_hot_paths (true, cold_bb_count,
170538fd1498Szrj                                           &bbs_in_hot_partition);
170638fd1498Szrj       if (cold_bb_count)
170738fd1498Szrj         sanitize_hot_paths (false, cold_bb_count, &bbs_in_hot_partition);
170838fd1498Szrj 
170938fd1498Szrj       hash_set <basic_block> set;
171038fd1498Szrj       find_bbs_reachable_by_hot_paths (&set);
171138fd1498Szrj       FOR_EACH_BB_FN (bb, cfun)
171238fd1498Szrj 	if (!set.contains (bb))
171338fd1498Szrj 	  BB_SET_PARTITION (bb, BB_COLD_PARTITION);
171438fd1498Szrj     }
171538fd1498Szrj 
171638fd1498Szrj   /* The format of .gcc_except_table does not allow landing pads to
171738fd1498Szrj      be in a different partition from the throw.  Fix this by either
1718*58e805e6Szrj      moving the landing pads or inserting forwarder landing pads.  */
171938fd1498Szrj   if (cfun->eh->lp_array)
172038fd1498Szrj     {
1721*58e805e6Szrj       const bool sjlj
1722*58e805e6Szrj 	= (targetm_common.except_unwind_info (&global_options) == UI_SJLJ);
172338fd1498Szrj       unsigned i;
172438fd1498Szrj       eh_landing_pad lp;
172538fd1498Szrj 
172638fd1498Szrj       FOR_EACH_VEC_ELT (*cfun->eh->lp_array, i, lp)
172738fd1498Szrj 	{
172838fd1498Szrj 	  bool all_same, all_diff;
172938fd1498Szrj 
173038fd1498Szrj 	  if (lp == NULL
173138fd1498Szrj 	      || lp->landing_pad == NULL_RTX
173238fd1498Szrj 	      || !LABEL_P (lp->landing_pad))
173338fd1498Szrj 	    continue;
173438fd1498Szrj 
173538fd1498Szrj 	  all_same = all_diff = true;
173638fd1498Szrj 	  bb = BLOCK_FOR_INSN (lp->landing_pad);
173738fd1498Szrj 	  FOR_EACH_EDGE (e, ei, bb->preds)
173838fd1498Szrj 	    {
173938fd1498Szrj 	      gcc_assert (e->flags & EDGE_EH);
174038fd1498Szrj 	      if (BB_PARTITION (bb) == BB_PARTITION (e->src))
174138fd1498Szrj 		all_diff = false;
174238fd1498Szrj 	      else
174338fd1498Szrj 		all_same = false;
174438fd1498Szrj 	    }
174538fd1498Szrj 
174638fd1498Szrj 	  if (all_same)
174738fd1498Szrj 	    ;
174838fd1498Szrj 	  else if (all_diff)
174938fd1498Szrj 	    {
175038fd1498Szrj 	      int which = BB_PARTITION (bb);
175138fd1498Szrj 	      which ^= BB_HOT_PARTITION | BB_COLD_PARTITION;
175238fd1498Szrj 	      BB_SET_PARTITION (bb, which);
175338fd1498Szrj 	    }
1754*58e805e6Szrj 	  else if (sjlj)
1755*58e805e6Szrj 	    sjlj_fix_up_crossing_landing_pad (bb);
175638fd1498Szrj 	  else
1757*58e805e6Szrj 	    dw2_fix_up_crossing_landing_pad (lp, bb);
1758*58e805e6Szrj 
1759*58e805e6Szrj 	  /* There is a single, common landing pad in SJLJ mode.  */
1760*58e805e6Szrj 	  if (sjlj)
1761*58e805e6Szrj 	    break;
176238fd1498Szrj 	}
176338fd1498Szrj     }
176438fd1498Szrj 
176538fd1498Szrj   /* Mark every edge that crosses between sections.  */
176638fd1498Szrj   FOR_EACH_BB_FN (bb, cfun)
176738fd1498Szrj     FOR_EACH_EDGE (e, ei, bb->succs)
176838fd1498Szrj       {
176938fd1498Szrj 	unsigned int flags = e->flags;
177038fd1498Szrj 
177138fd1498Szrj         /* We should never have EDGE_CROSSING set yet.  */
177238fd1498Szrj 	gcc_checking_assert ((flags & EDGE_CROSSING) == 0);
177338fd1498Szrj 
177438fd1498Szrj 	if (e->src != ENTRY_BLOCK_PTR_FOR_FN (cfun)
177538fd1498Szrj 	    && e->dest != EXIT_BLOCK_PTR_FOR_FN (cfun)
177638fd1498Szrj 	    && BB_PARTITION (e->src) != BB_PARTITION (e->dest))
177738fd1498Szrj 	  {
177838fd1498Szrj 	    crossing_edges.safe_push (e);
177938fd1498Szrj 	    flags |= EDGE_CROSSING;
178038fd1498Szrj 	  }
178138fd1498Szrj 
178238fd1498Szrj 	/* Now that we've split eh edges as appropriate, allow landing pads
178338fd1498Szrj 	   to be merged with the post-landing pads.  */
178438fd1498Szrj 	flags &= ~EDGE_PRESERVE;
178538fd1498Szrj 
178638fd1498Szrj 	e->flags = flags;
178738fd1498Szrj       }
178838fd1498Szrj 
178938fd1498Szrj   return crossing_edges;
179038fd1498Szrj }
179138fd1498Szrj 
179238fd1498Szrj /* Set the flag EDGE_CAN_FALLTHRU for edges that can be fallthru.  */
179338fd1498Szrj 
179438fd1498Szrj static void
179538fd1498Szrj set_edge_can_fallthru_flag (void)
179638fd1498Szrj {
179738fd1498Szrj   basic_block bb;
179838fd1498Szrj 
179938fd1498Szrj   FOR_EACH_BB_FN (bb, cfun)
180038fd1498Szrj     {
180138fd1498Szrj       edge e;
180238fd1498Szrj       edge_iterator ei;
180338fd1498Szrj 
180438fd1498Szrj       FOR_EACH_EDGE (e, ei, bb->succs)
180538fd1498Szrj 	{
180638fd1498Szrj 	  e->flags &= ~EDGE_CAN_FALLTHRU;
180738fd1498Szrj 
180838fd1498Szrj 	  /* The FALLTHRU edge is also CAN_FALLTHRU edge.  */
180938fd1498Szrj 	  if (e->flags & EDGE_FALLTHRU)
181038fd1498Szrj 	    e->flags |= EDGE_CAN_FALLTHRU;
181138fd1498Szrj 	}
181238fd1498Szrj 
181338fd1498Szrj       /* If the BB ends with an invertible condjump, all (2) edges are
181438fd1498Szrj 	 CAN_FALLTHRU edges.  */
181538fd1498Szrj       if (EDGE_COUNT (bb->succs) != 2)
181638fd1498Szrj 	continue;
181738fd1498Szrj       if (!any_condjump_p (BB_END (bb)))
181838fd1498Szrj 	continue;
181938fd1498Szrj 
182038fd1498Szrj       rtx_jump_insn *bb_end_jump = as_a <rtx_jump_insn *> (BB_END (bb));
182138fd1498Szrj       if (!invert_jump (bb_end_jump, JUMP_LABEL (bb_end_jump), 0))
182238fd1498Szrj 	continue;
182338fd1498Szrj       invert_jump (bb_end_jump, JUMP_LABEL (bb_end_jump), 0);
182438fd1498Szrj       EDGE_SUCC (bb, 0)->flags |= EDGE_CAN_FALLTHRU;
182538fd1498Szrj       EDGE_SUCC (bb, 1)->flags |= EDGE_CAN_FALLTHRU;
182638fd1498Szrj     }
182738fd1498Szrj }
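/* The paired invert_jump calls above act as a reversibility probe
   rather than a transformation: the first call tests whether the
   condition can be inverted at all, and on success the second call
   immediately inverts it back, leaving the insn unchanged while
   proving that either successor could be made the fallthrough.  */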
182838fd1498Szrj 
182938fd1498Szrj /* If any destination of a crossing edge does not have a label, add a label;
183038fd1498Szrj    convert any easy fall-through crossing edges to unconditional jumps.  */
183138fd1498Szrj 
183238fd1498Szrj static void
183338fd1498Szrj add_labels_and_missing_jumps (vec<edge> crossing_edges)
183438fd1498Szrj {
183538fd1498Szrj   size_t i;
183638fd1498Szrj   edge e;
183738fd1498Szrj 
183838fd1498Szrj   FOR_EACH_VEC_ELT (crossing_edges, i, e)
183938fd1498Szrj     {
184038fd1498Szrj       basic_block src = e->src;
184138fd1498Szrj       basic_block dest = e->dest;
184238fd1498Szrj       rtx_jump_insn *new_jump;
184338fd1498Szrj 
184438fd1498Szrj       if (dest == EXIT_BLOCK_PTR_FOR_FN (cfun))
184538fd1498Szrj 	continue;
184638fd1498Szrj 
184738fd1498Szrj       /* Make sure dest has a label.  */
184838fd1498Szrj       rtx_code_label *label = block_label (dest);
184938fd1498Szrj 
185038fd1498Szrj       /* Nothing to do for non-fallthru edges.  */
185138fd1498Szrj       if (src == ENTRY_BLOCK_PTR_FOR_FN (cfun))
185238fd1498Szrj 	continue;
185338fd1498Szrj       if ((e->flags & EDGE_FALLTHRU) == 0)
185438fd1498Szrj 	continue;
185538fd1498Szrj 
185638fd1498Szrj       /* If the block does not end with a control flow insn, then we
185738fd1498Szrj 	 can trivially add a jump to the end to fixup the crossing.
185838fd1498Szrj 	 Otherwise the jump will have to go in a new bb, which will
185938fd1498Szrj 	 be handled by fix_up_fall_thru_edges function.  */
186038fd1498Szrj       if (control_flow_insn_p (BB_END (src)))
186138fd1498Szrj 	continue;
186238fd1498Szrj 
186338fd1498Szrj       /* Make sure there's only one successor.  */
186438fd1498Szrj       gcc_assert (single_succ_p (src));
186538fd1498Szrj 
186638fd1498Szrj       new_jump = emit_jump_insn_after (targetm.gen_jump (label), BB_END (src));
186738fd1498Szrj       BB_END (src) = new_jump;
186838fd1498Szrj       JUMP_LABEL (new_jump) = label;
186938fd1498Szrj       LABEL_NUSES (label) += 1;
187038fd1498Szrj 
187138fd1498Szrj       emit_barrier_after_bb (src);
187238fd1498Szrj 
187338fd1498Szrj       /* Mark edge as non-fallthru.  */
187438fd1498Szrj       e->flags &= ~EDGE_FALLTHRU;
187538fd1498Szrj     }
187638fd1498Szrj }
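/* For instance (hypothetical blocks): if block B4 simply falls off
   its end into block B5 in the other section, the loop above appends
   an explicit jump to B5's label at the end of B4, clears
   EDGE_FALLTHRU on the edge, and emits a barrier; no new block is
   needed because B4 did not already end in a control-flow insn.  */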
187738fd1498Szrj 
187838fd1498Szrj /* Find any bb's where the fall-through edge is a crossing edge (note that
187938fd1498Szrj    these bb's must also contain a conditional jump or end with a call
188038fd1498Szrj    instruction; we've already dealt with fall-through edges for blocks
188138fd1498Szrj    that didn't have a conditional jump or didn't end with call instruction
188238fd1498Szrj    in the call to add_labels_and_missing_jumps).  Convert the fall-through
188338fd1498Szrj    edge to non-crossing edge by inserting a new bb to fall-through into.
188438fd1498Szrj    The new bb will contain an unconditional jump (crossing edge) to the
188538fd1498Szrj    original fall through destination.  */
188638fd1498Szrj 
188738fd1498Szrj static void
188838fd1498Szrj fix_up_fall_thru_edges (void)
188938fd1498Szrj {
189038fd1498Szrj   basic_block cur_bb;
189138fd1498Szrj 
189238fd1498Szrj   FOR_EACH_BB_FN (cur_bb, cfun)
189338fd1498Szrj     {
189438fd1498Szrj       edge succ1;
189538fd1498Szrj       edge succ2;
189638fd1498Szrj       edge fall_thru = NULL;
189738fd1498Szrj       edge cond_jump = NULL;
189838fd1498Szrj 
189938fd1498Szrj       fall_thru = NULL;
190038fd1498Szrj       if (EDGE_COUNT (cur_bb->succs) > 0)
190138fd1498Szrj 	succ1 = EDGE_SUCC (cur_bb, 0);
190238fd1498Szrj       else
190338fd1498Szrj 	succ1 = NULL;
190438fd1498Szrj 
190538fd1498Szrj       if (EDGE_COUNT (cur_bb->succs) > 1)
190638fd1498Szrj 	succ2 = EDGE_SUCC (cur_bb, 1);
190738fd1498Szrj       else
190838fd1498Szrj 	succ2 = NULL;
190938fd1498Szrj 
191038fd1498Szrj       /* Find the fall-through edge.  */
191138fd1498Szrj 
191238fd1498Szrj       if (succ1
191338fd1498Szrj 	  && (succ1->flags & EDGE_FALLTHRU))
191438fd1498Szrj 	{
191538fd1498Szrj 	  fall_thru = succ1;
191638fd1498Szrj 	  cond_jump = succ2;
191738fd1498Szrj 	}
191838fd1498Szrj       else if (succ2
191938fd1498Szrj 	       && (succ2->flags & EDGE_FALLTHRU))
192038fd1498Szrj 	{
192138fd1498Szrj 	  fall_thru = succ2;
192238fd1498Szrj 	  cond_jump = succ1;
192338fd1498Szrj 	}
192438fd1498Szrj       else if (succ2 && EDGE_COUNT (cur_bb->succs) > 2)
192538fd1498Szrj 	fall_thru = find_fallthru_edge (cur_bb->succs);
192638fd1498Szrj 
192738fd1498Szrj       if (fall_thru && (fall_thru->dest != EXIT_BLOCK_PTR_FOR_FN (cfun)))
192838fd1498Szrj 	{
192938fd1498Szrj 	  /* Check to see if the fall-thru edge is a crossing edge.  */
193038fd1498Szrj 
193138fd1498Szrj 	  if (fall_thru->flags & EDGE_CROSSING)
193238fd1498Szrj 	    {
193338fd1498Szrj 	      /* The fall_thru edge crosses; now check the cond jump edge, if
193438fd1498Szrj 		 it exists.  */
193538fd1498Szrj 
193638fd1498Szrj 	      bool cond_jump_crosses = true;
193738fd1498Szrj 	      int invert_worked = 0;
193838fd1498Szrj 	      rtx_insn *old_jump = BB_END (cur_bb);
193938fd1498Szrj 
194038fd1498Szrj 	      /* Find the jump instruction, if there is one.  */
194138fd1498Szrj 
194238fd1498Szrj 	      if (cond_jump)
194338fd1498Szrj 		{
194438fd1498Szrj 		  if (!(cond_jump->flags & EDGE_CROSSING))
194538fd1498Szrj 		    cond_jump_crosses = false;
194638fd1498Szrj 
194738fd1498Szrj 		  /* We know the fall-thru edge crosses; if the cond
194838fd1498Szrj 		     jump edge does NOT cross, and its destination is the
194938fd1498Szrj 		     next block in the bb order, invert the jump
195038fd1498Szrj 		     (i.e. fix it so the fall through does not cross and
195138fd1498Szrj 		     the cond jump does).  */
195238fd1498Szrj 
195338fd1498Szrj 		  if (!cond_jump_crosses)
195438fd1498Szrj 		    {
195538fd1498Szrj 		      /* Find label in fall_thru block. We've already added
195638fd1498Szrj 			 any missing labels, so there must be one.  */
195738fd1498Szrj 
195838fd1498Szrj 		      rtx_code_label *fall_thru_label
195938fd1498Szrj 			= block_label (fall_thru->dest);
196038fd1498Szrj 
196138fd1498Szrj 		      if (old_jump && fall_thru_label)
196238fd1498Szrj 			{
196338fd1498Szrj 			  rtx_jump_insn *old_jump_insn
196438fd1498Szrj 			    = dyn_cast <rtx_jump_insn *> (old_jump);
196538fd1498Szrj 			  if (old_jump_insn)
196638fd1498Szrj 			    invert_worked = invert_jump (old_jump_insn,
196738fd1498Szrj 							 fall_thru_label, 0);
196838fd1498Szrj 			}
196938fd1498Szrj 
197038fd1498Szrj 		      if (invert_worked)
197138fd1498Szrj 			{
197238fd1498Szrj 			  fall_thru->flags &= ~EDGE_FALLTHRU;
197338fd1498Szrj 			  cond_jump->flags |= EDGE_FALLTHRU;
197438fd1498Szrj 			  update_br_prob_note (cur_bb);
197538fd1498Szrj 			  std::swap (fall_thru, cond_jump);
197638fd1498Szrj 			  cond_jump->flags |= EDGE_CROSSING;
197738fd1498Szrj 			  fall_thru->flags &= ~EDGE_CROSSING;
197838fd1498Szrj 			}
197938fd1498Szrj 		    }
198038fd1498Szrj 		}
198138fd1498Szrj 
198238fd1498Szrj 	      if (cond_jump_crosses || !invert_worked)
198338fd1498Szrj 		{
198438fd1498Szrj 		  /* This is the case where both edges out of the basic
198538fd1498Szrj 		     block are crossing edges. Here we will fix up the
198638fd1498Szrj 		     fall through edge. The jump edge will be taken care
198738fd1498Szrj 		     of later.  The EDGE_CROSSING flag of the fall_thru edge
198838fd1498Szrj 		     is unset before the call to the force_nonfallthru
198938fd1498Szrj 		     function because, if a new basic block is created,
199038fd1498Szrj 		     this edge remains in the current section boundary
199138fd1498Szrj 		     while the edge between new_bb and the fall_thru->dest
199238fd1498Szrj 		     becomes EDGE_CROSSING.  */
199338fd1498Szrj 
199438fd1498Szrj 		  fall_thru->flags &= ~EDGE_CROSSING;
199538fd1498Szrj 		  basic_block new_bb = force_nonfallthru (fall_thru);
199638fd1498Szrj 
199738fd1498Szrj 		  if (new_bb)
199838fd1498Szrj 		    {
199938fd1498Szrj 		      new_bb->aux = cur_bb->aux;
200038fd1498Szrj 		      cur_bb->aux = new_bb;
200138fd1498Szrj 
200238fd1498Szrj                       /* This is done by force_nonfallthru_and_redirect.  */
200338fd1498Szrj 		      gcc_assert (BB_PARTITION (new_bb)
200438fd1498Szrj                                   == BB_PARTITION (cur_bb));
200538fd1498Szrj 
200638fd1498Szrj 		      single_succ_edge (new_bb)->flags |= EDGE_CROSSING;
200738fd1498Szrj 		    }
200838fd1498Szrj 		  else
200938fd1498Szrj 		    {
201038fd1498Szrj 		      /* If a new basic block was not created, restore
201138fd1498Szrj 			 the EDGE_CROSSING flag.  */
201238fd1498Szrj 		      fall_thru->flags |= EDGE_CROSSING;
201338fd1498Szrj 		    }
201438fd1498Szrj 
201538fd1498Szrj 		  /* Add a barrier after the new jump.  */
201638fd1498Szrj 		  emit_barrier_after_bb (new_bb ? new_bb : cur_bb);
201738fd1498Szrj 		}
201838fd1498Szrj 	    }
201938fd1498Szrj 	}
202038fd1498Szrj     }
202138fd1498Szrj }
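/* Illustrative sketch of the inversion case above (hypothetical
   labels): cur_bb ends in "je .L_same" and falls through into the
   other section.  After invert_jump it ends in "jne .L_other", the
   label of the old fall-through destination, and falls through to
   .L_same instead, so the section-crossing transfer becomes the
   explicit conditional jump while the fallthrough stays within the
   current section.  */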
202238fd1498Szrj 
202338fd1498Szrj /* This function checks the destination block of a "crossing jump" to
202438fd1498Szrj    see if it has any crossing predecessors that begin with a code label
202538fd1498Szrj    and end with an unconditional jump.  If so, it returns that predecessor
202638fd1498Szrj    block.  (This is to avoid creating lots of new basic blocks that all
202738fd1498Szrj    contain unconditional jumps to the same destination).  */
202838fd1498Szrj 
202938fd1498Szrj static basic_block
203038fd1498Szrj find_jump_block (basic_block jump_dest)
203138fd1498Szrj {
203238fd1498Szrj   basic_block source_bb = NULL;
203338fd1498Szrj   edge e;
203438fd1498Szrj   rtx_insn *insn;
203538fd1498Szrj   edge_iterator ei;
203638fd1498Szrj 
203738fd1498Szrj   FOR_EACH_EDGE (e, ei, jump_dest->preds)
203838fd1498Szrj     if (e->flags & EDGE_CROSSING)
203938fd1498Szrj       {
204038fd1498Szrj 	basic_block src = e->src;
204138fd1498Szrj 
204238fd1498Szrj 	/* Check each predecessor to see if it has a label, and contains
204338fd1498Szrj 	   only one executable instruction, which is an unconditional jump.
204438fd1498Szrj 	   If so, we can use it.  */
204538fd1498Szrj 
204638fd1498Szrj 	if (LABEL_P (BB_HEAD (src)))
204738fd1498Szrj 	  for (insn = BB_HEAD (src);
204838fd1498Szrj 	       !INSN_P (insn) && insn != NEXT_INSN (BB_END (src));
204938fd1498Szrj 	       insn = NEXT_INSN (insn))
205038fd1498Szrj 	    {
205138fd1498Szrj 	      if (INSN_P (insn)
205238fd1498Szrj 		  && insn == BB_END (src)
205338fd1498Szrj 		  && JUMP_P (insn)
205438fd1498Szrj 		  && !any_condjump_p (insn))
205538fd1498Szrj 		{
205638fd1498Szrj 		  source_bb = src;
205738fd1498Szrj 		  break;
205838fd1498Szrj 		}
205938fd1498Szrj 	    }
206038fd1498Szrj 
206138fd1498Szrj 	if (source_bb)
206238fd1498Szrj 	  break;
206338fd1498Szrj       }
206438fd1498Szrj 
206538fd1498Szrj   return source_bb;
206638fd1498Szrj }
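/* For example (hypothetical indices): if blocks B7 and B9 both need a
   crossing jump to block B3, the caller can redirect B9's branch to
   the stub block it already created for B7, which per the comment
   above consists of just a label and an unconditional jump.  */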
206738fd1498Szrj 
206838fd1498Szrj /* Find all BB's with conditional jumps that are crossing edges;
206938fd1498Szrj    insert a new bb and make the conditional jump branch to the new
207038fd1498Szrj    bb instead (make the new bb the same color so the conditional branch won't
207138fd1498Szrj    be a 'crossing' edge).  Insert an unconditional jump from the
207238fd1498Szrj    new bb to the original destination of the conditional jump.  */
207338fd1498Szrj 
207438fd1498Szrj static void
207538fd1498Szrj fix_crossing_conditional_branches (void)
207638fd1498Szrj {
207738fd1498Szrj   basic_block cur_bb;
207838fd1498Szrj   basic_block new_bb;
207938fd1498Szrj   basic_block dest;
208038fd1498Szrj   edge succ1;
208138fd1498Szrj   edge succ2;
208238fd1498Szrj   edge crossing_edge;
208338fd1498Szrj   edge new_edge;
208438fd1498Szrj   rtx set_src;
208538fd1498Szrj   rtx old_label = NULL_RTX;
208638fd1498Szrj   rtx_code_label *new_label;
208738fd1498Szrj 
208838fd1498Szrj   FOR_EACH_BB_FN (cur_bb, cfun)
208938fd1498Szrj     {
209038fd1498Szrj       crossing_edge = NULL;
209138fd1498Szrj       if (EDGE_COUNT (cur_bb->succs) > 0)
209238fd1498Szrj 	succ1 = EDGE_SUCC (cur_bb, 0);
209338fd1498Szrj       else
209438fd1498Szrj 	succ1 = NULL;
209538fd1498Szrj 
209638fd1498Szrj       if (EDGE_COUNT (cur_bb->succs) > 1)
209738fd1498Szrj 	succ2 = EDGE_SUCC (cur_bb, 1);
209838fd1498Szrj       else
209938fd1498Szrj 	succ2 = NULL;
210038fd1498Szrj 
210138fd1498Szrj       /* We already took care of fall-through edges, so only one successor
210238fd1498Szrj 	 can be a crossing edge.  */
210338fd1498Szrj 
210438fd1498Szrj       if (succ1 && (succ1->flags & EDGE_CROSSING))
210538fd1498Szrj 	crossing_edge = succ1;
210638fd1498Szrj       else if (succ2 && (succ2->flags & EDGE_CROSSING))
210738fd1498Szrj 	crossing_edge = succ2;
210838fd1498Szrj 
210938fd1498Szrj       if (crossing_edge)
211038fd1498Szrj 	{
211138fd1498Szrj 	  rtx_insn *old_jump = BB_END (cur_bb);
211238fd1498Szrj 
211338fd1498Szrj 	  /* Check to make sure the jump instruction is a
211438fd1498Szrj 	     conditional jump.  */
211538fd1498Szrj 
211638fd1498Szrj 	  set_src = NULL_RTX;
211738fd1498Szrj 
211838fd1498Szrj 	  if (any_condjump_p (old_jump))
211938fd1498Szrj 	    {
212038fd1498Szrj 	      if (GET_CODE (PATTERN (old_jump)) == SET)
212138fd1498Szrj 		set_src = SET_SRC (PATTERN (old_jump));
212238fd1498Szrj 	      else if (GET_CODE (PATTERN (old_jump)) == PARALLEL)
212338fd1498Szrj 		{
212438fd1498Szrj 		  set_src = XVECEXP (PATTERN (old_jump), 0, 0);
212538fd1498Szrj 		  if (GET_CODE (set_src) == SET)
212638fd1498Szrj 		    set_src = SET_SRC (set_src);
212738fd1498Szrj 		  else
212838fd1498Szrj 		    set_src = NULL_RTX;
212938fd1498Szrj 		}
213038fd1498Szrj 	    }
213138fd1498Szrj 
213238fd1498Szrj 	  if (set_src && (GET_CODE (set_src) == IF_THEN_ELSE))
213338fd1498Szrj 	    {
213438fd1498Szrj 	      rtx_jump_insn *old_jump_insn =
213538fd1498Szrj 			as_a <rtx_jump_insn *> (old_jump);
213638fd1498Szrj 
213738fd1498Szrj 	      if (GET_CODE (XEXP (set_src, 1)) == PC)
213838fd1498Szrj 		old_label = XEXP (set_src, 2);
213938fd1498Szrj 	      else if (GET_CODE (XEXP (set_src, 2)) == PC)
214038fd1498Szrj 		old_label = XEXP (set_src, 1);
214138fd1498Szrj 
214238fd1498Szrj 	      /* Check to see if new bb for jumping to that dest has
214338fd1498Szrj 		 already been created; if so, use it; if not, create
214438fd1498Szrj 		 a new one.  */
214538fd1498Szrj 
214638fd1498Szrj 	      new_bb = find_jump_block (crossing_edge->dest);
214738fd1498Szrj 
214838fd1498Szrj 	      if (new_bb)
214938fd1498Szrj 		new_label = block_label (new_bb);
215038fd1498Szrj 	      else
215138fd1498Szrj 		{
215238fd1498Szrj 		  basic_block last_bb;
215338fd1498Szrj 		  rtx_code_label *old_jump_target;
215438fd1498Szrj 		  rtx_jump_insn *new_jump;
215538fd1498Szrj 
215638fd1498Szrj 		  /* Create new basic block to be dest for
215738fd1498Szrj 		     conditional jump.  */
215838fd1498Szrj 
215938fd1498Szrj 		  /* Put appropriate instructions in new bb.  */
216038fd1498Szrj 
216138fd1498Szrj 		  new_label = gen_label_rtx ();
216238fd1498Szrj 		  emit_label (new_label);
216338fd1498Szrj 
216438fd1498Szrj 		  gcc_assert (GET_CODE (old_label) == LABEL_REF);
216538fd1498Szrj 		  old_jump_target = old_jump_insn->jump_target ();
216638fd1498Szrj 		  new_jump = as_a <rtx_jump_insn *>
216738fd1498Szrj 		    (emit_jump_insn (targetm.gen_jump (old_jump_target)));
216838fd1498Szrj 		  new_jump->set_jump_target (old_jump_target);
216938fd1498Szrj 
217038fd1498Szrj 		  last_bb = EXIT_BLOCK_PTR_FOR_FN (cfun)->prev_bb;
217138fd1498Szrj 		  new_bb = create_basic_block (new_label, new_jump, last_bb);
217238fd1498Szrj 		  new_bb->aux = last_bb->aux;
217338fd1498Szrj 		  last_bb->aux = new_bb;
217438fd1498Szrj 
217538fd1498Szrj 		  emit_barrier_after_bb (new_bb);
217638fd1498Szrj 
217738fd1498Szrj 		  /* Make sure new bb is in same partition as source
217838fd1498Szrj 		     of conditional branch.  */
217938fd1498Szrj 		  BB_COPY_PARTITION (new_bb, cur_bb);
218038fd1498Szrj 		}
218138fd1498Szrj 
218238fd1498Szrj 	      /* Make old jump branch to new bb.  */
218338fd1498Szrj 
218438fd1498Szrj 	      redirect_jump (old_jump_insn, new_label, 0);
218538fd1498Szrj 
218638fd1498Szrj 	      /* Remove crossing_edge as predecessor of 'dest'.  */
218738fd1498Szrj 
218838fd1498Szrj 	      dest = crossing_edge->dest;
218938fd1498Szrj 
219038fd1498Szrj 	      redirect_edge_succ (crossing_edge, new_bb);
219138fd1498Szrj 
219238fd1498Szrj 	      /* Make a new edge from new_bb to old dest; new edge
219338fd1498Szrj 		 will be a successor for new_bb and a predecessor
219438fd1498Szrj 		 for 'dest'.  */
219538fd1498Szrj 
219638fd1498Szrj 	      if (EDGE_COUNT (new_bb->succs) == 0)
219738fd1498Szrj 		new_edge = make_single_succ_edge (new_bb, dest, 0);
219838fd1498Szrj 	      else
219938fd1498Szrj 		new_edge = EDGE_SUCC (new_bb, 0);
220038fd1498Szrj 
220138fd1498Szrj 	      crossing_edge->flags &= ~EDGE_CROSSING;
220238fd1498Szrj 	      new_edge->flags |= EDGE_CROSSING;
220338fd1498Szrj 	    }
220438fd1498Szrj 	}
220538fd1498Szrj     }
220638fd1498Szrj }
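/* Net effect, sketched on a hypothetical branch: a crossing
   "jlt .L_cold" becomes a same-section "jlt .L_stub", where the stub
   block contains only "jmp .L_cold" and now carries the crossing
   edge.  Unconditional crossing jumps are lowered separately by the
   next function.  */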
220738fd1498Szrj 
220838fd1498Szrj /* Find any unconditional branches that cross between hot and cold
220938fd1498Szrj    sections.  Convert them into indirect jumps instead.  */
221038fd1498Szrj 
221138fd1498Szrj static void
221238fd1498Szrj fix_crossing_unconditional_branches (void)
221338fd1498Szrj {
221438fd1498Szrj   basic_block cur_bb;
221538fd1498Szrj   rtx_insn *last_insn;
221638fd1498Szrj   rtx label;
221738fd1498Szrj   rtx label_addr;
221838fd1498Szrj   rtx_insn *indirect_jump_sequence;
221938fd1498Szrj   rtx_insn *jump_insn = NULL;
222038fd1498Szrj   rtx new_reg;
222138fd1498Szrj   rtx_insn *cur_insn;
222238fd1498Szrj   edge succ;
222338fd1498Szrj 
222438fd1498Szrj   FOR_EACH_BB_FN (cur_bb, cfun)
222538fd1498Szrj     {
222638fd1498Szrj       last_insn = BB_END (cur_bb);
222738fd1498Szrj 
222838fd1498Szrj       if (EDGE_COUNT (cur_bb->succs) < 1)
222938fd1498Szrj 	continue;
223038fd1498Szrj 
223138fd1498Szrj       succ = EDGE_SUCC (cur_bb, 0);
223238fd1498Szrj 
223338fd1498Szrj       /* Check to see if bb ends in a crossing (unconditional) jump.  At
223438fd1498Szrj 	 this point, no crossing jumps should be conditional.  */
223538fd1498Szrj 
223638fd1498Szrj       if (JUMP_P (last_insn)
223738fd1498Szrj 	  && (succ->flags & EDGE_CROSSING))
223838fd1498Szrj 	{
223938fd1498Szrj 	  gcc_assert (!any_condjump_p (last_insn));
224038fd1498Szrj 
224138fd1498Szrj 	  /* Make sure the jump is not already an indirect or table jump.  */
224238fd1498Szrj 
224338fd1498Szrj 	  if (!computed_jump_p (last_insn)
224438fd1498Szrj 	      && !tablejump_p (last_insn, NULL, NULL))
224538fd1498Szrj 	    {
224638fd1498Szrj 	      /* We have found a "crossing" unconditional branch.  Now
224738fd1498Szrj 		 we must convert it to an indirect jump.  First create
224838fd1498Szrj 		 a reference to the label, as the target for the jump.  */
224938fd1498Szrj 
225038fd1498Szrj 	      label = JUMP_LABEL (last_insn);
225138fd1498Szrj 	      label_addr = gen_rtx_LABEL_REF (Pmode, label);
225238fd1498Szrj 	      LABEL_NUSES (label) += 1;
225338fd1498Szrj 
225438fd1498Szrj 	      /* Get a register to use for the indirect jump.  */
225538fd1498Szrj 
225638fd1498Szrj 	      new_reg = gen_reg_rtx (Pmode);
225738fd1498Szrj 
225838fd1498Szrj 	      /* Generate the indirect jump sequence.  */
225938fd1498Szrj 
226038fd1498Szrj 	      start_sequence ();
226138fd1498Szrj 	      emit_move_insn (new_reg, label_addr);
226238fd1498Szrj 	      emit_indirect_jump (new_reg);
226338fd1498Szrj 	      indirect_jump_sequence = get_insns ();
226438fd1498Szrj 	      end_sequence ();
226538fd1498Szrj 
226638fd1498Szrj 	      /* Make sure every instruction in the new jump sequence has
226738fd1498Szrj 		 its basic block set to be cur_bb.  */
226838fd1498Szrj 
226938fd1498Szrj 	      for (cur_insn = indirect_jump_sequence; cur_insn;
227038fd1498Szrj 		   cur_insn = NEXT_INSN (cur_insn))
227138fd1498Szrj 		{
227238fd1498Szrj 		  if (!BARRIER_P (cur_insn))
227338fd1498Szrj 		    BLOCK_FOR_INSN (cur_insn) = cur_bb;
227438fd1498Szrj 		  if (JUMP_P (cur_insn))
227538fd1498Szrj 		    jump_insn = cur_insn;
227638fd1498Szrj 		}
227738fd1498Szrj 
227838fd1498Szrj 	      /* Insert the new (indirect) jump sequence immediately before
227938fd1498Szrj 		 the unconditional jump, then delete the unconditional jump.  */
228038fd1498Szrj 
228138fd1498Szrj 	      emit_insn_before (indirect_jump_sequence, last_insn);
228238fd1498Szrj 	      delete_insn (last_insn);
228338fd1498Szrj 
228438fd1498Szrj 	      JUMP_LABEL (jump_insn) = label;
228538fd1498Szrj 	      LABEL_NUSES (label)++;
228638fd1498Szrj 
228738fd1498Szrj 	      /* Make BB_END for cur_bb be the jump instruction (NOT the
228838fd1498Szrj 		 barrier instruction at the end of the sequence...).  */
228938fd1498Szrj 
229038fd1498Szrj 	      BB_END (cur_bb) = jump_insn;
229138fd1498Szrj 	    }
229238fd1498Szrj 	}
229338fd1498Szrj     }
229438fd1498Szrj }
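/* On a typical target this rewrites, e.g., "jmp .L_cold" into a
   two-insn sequence along the lines of "load REG, &.L_cold" followed
   by "jmp *REG" (syntax illustrative), trading a direct branch whose
   displacement might not reach the other section for an indirect jump
   through a fresh pseudo register.  */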
229538fd1498Szrj 
229638fd1498Szrj /* Update CROSSING_JUMP_P flags on all jump insns.  */
229738fd1498Szrj 
229838fd1498Szrj static void
229938fd1498Szrj update_crossing_jump_flags (void)
230038fd1498Szrj {
230138fd1498Szrj   basic_block bb;
230238fd1498Szrj   edge e;
230338fd1498Szrj   edge_iterator ei;
230438fd1498Szrj 
230538fd1498Szrj   FOR_EACH_BB_FN (bb, cfun)
230638fd1498Szrj     FOR_EACH_EDGE (e, ei, bb->succs)
230738fd1498Szrj       if (e->flags & EDGE_CROSSING)
230838fd1498Szrj 	{
230938fd1498Szrj 	  if (JUMP_P (BB_END (bb)))
231038fd1498Szrj 	    CROSSING_JUMP_P (BB_END (bb)) = 1;
231138fd1498Szrj 	  break;
231238fd1498Szrj 	}
231338fd1498Szrj }
231438fd1498Szrj 
231538fd1498Szrj /* Reorder basic blocks using the software trace cache (STC) algorithm.  */
231638fd1498Szrj 
231738fd1498Szrj static void
231838fd1498Szrj reorder_basic_blocks_software_trace_cache (void)
231938fd1498Szrj {
232038fd1498Szrj   if (dump_file)
232138fd1498Szrj     fprintf (dump_file, "\nReordering with the STC algorithm.\n\n");
232238fd1498Szrj 
232338fd1498Szrj   int n_traces;
232438fd1498Szrj   int i;
232538fd1498Szrj   struct trace *traces;
232638fd1498Szrj 
232738fd1498Szrj   /* We estimate the length of the uncond jump insn only once, since the code
232838fd1498Szrj      for getting the insn length always returns the minimal length now.  */
232938fd1498Szrj   if (uncond_jump_length == 0)
233038fd1498Szrj     uncond_jump_length = get_uncond_jump_length ();
233138fd1498Szrj 
233238fd1498Szrj   /* We need to know some information for each basic block.  */
233338fd1498Szrj   array_size = GET_ARRAY_SIZE (last_basic_block_for_fn (cfun));
233438fd1498Szrj   bbd = XNEWVEC (bbro_basic_block_data, array_size);
233538fd1498Szrj   for (i = 0; i < array_size; i++)
233638fd1498Szrj     {
233738fd1498Szrj       bbd[i].start_of_trace = -1;
233838fd1498Szrj       bbd[i].end_of_trace = -1;
233938fd1498Szrj       bbd[i].in_trace = -1;
234038fd1498Szrj       bbd[i].visited = 0;
234138fd1498Szrj       bbd[i].priority = -1;
234238fd1498Szrj       bbd[i].heap = NULL;
234338fd1498Szrj       bbd[i].node = NULL;
234438fd1498Szrj     }
234538fd1498Szrj 
234638fd1498Szrj   traces = XNEWVEC (struct trace, n_basic_blocks_for_fn (cfun));
234738fd1498Szrj   n_traces = 0;
234838fd1498Szrj   find_traces (&n_traces, traces);
234938fd1498Szrj   connect_traces (n_traces, traces);
235038fd1498Szrj   FREE (traces);
235138fd1498Szrj   FREE (bbd);
235238fd1498Szrj }
235338fd1498Szrj 
235438fd1498Szrj /* Return true if edge E1 is more desirable as a fallthrough edge than
235538fd1498Szrj    edge E2 is.  */
235638fd1498Szrj 
235738fd1498Szrj static bool
235838fd1498Szrj edge_order (edge e1, edge e2)
235938fd1498Szrj {
236038fd1498Szrj   return e1->count () > e2->count ();
236138fd1498Szrj }
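/* Note: edge_order compares by execution count alone and is paired
   with std::stable_sort below, so edges with equal counts keep their
   collection order: the function entry edge first, then edges in
   basic-block index order.  */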
236238fd1498Szrj 
236338fd1498Szrj /* Reorder basic blocks using the "simple" algorithm.  This tries to
236438fd1498Szrj    maximize the dynamic number of branches that are fallthrough, without
236538fd1498Szrj    copying instructions.  The algorithm is greedy, looking at the most
236638fd1498Szrj    frequently executed branch first.  */
236738fd1498Szrj 
236838fd1498Szrj static void
236938fd1498Szrj reorder_basic_blocks_simple (void)
237038fd1498Szrj {
237138fd1498Szrj   if (dump_file)
237238fd1498Szrj     fprintf (dump_file, "\nReordering with the \"simple\" algorithm.\n\n");
237338fd1498Szrj 
237438fd1498Szrj   edge *edges = new edge[2 * n_basic_blocks_for_fn (cfun)];
237538fd1498Szrj 
237638fd1498Szrj   /* First, collect all edges that can be optimized by reordering blocks:
237738fd1498Szrj      simple jumps and conditional jumps, as well as the function entry edge.  */
237838fd1498Szrj 
237938fd1498Szrj   int n = 0;
238038fd1498Szrj   edges[n++] = EDGE_SUCC (ENTRY_BLOCK_PTR_FOR_FN (cfun), 0);
238138fd1498Szrj 
238238fd1498Szrj   basic_block bb;
238338fd1498Szrj   FOR_EACH_BB_FN (bb, cfun)
238438fd1498Szrj     {
238538fd1498Szrj       rtx_insn *end = BB_END (bb);
238638fd1498Szrj 
238738fd1498Szrj       if (computed_jump_p (end) || tablejump_p (end, NULL, NULL))
238838fd1498Szrj 	continue;
238938fd1498Szrj 
239038fd1498Szrj       /* We cannot optimize asm goto.  */
239138fd1498Szrj       if (JUMP_P (end) && extract_asm_operands (end))
239238fd1498Szrj 	continue;
239338fd1498Szrj 
239438fd1498Szrj       if (single_succ_p (bb))
239538fd1498Szrj 	edges[n++] = EDGE_SUCC (bb, 0);
239638fd1498Szrj       else if (any_condjump_p (end))
239738fd1498Szrj 	{
239838fd1498Szrj 	  edge e0 = EDGE_SUCC (bb, 0);
239938fd1498Szrj 	  edge e1 = EDGE_SUCC (bb, 1);
240038fd1498Szrj 	  /* When optimizing for size it is best to keep the original
240138fd1498Szrj 	     fallthrough edges.  */
240238fd1498Szrj 	  if (e1->flags & EDGE_FALLTHRU)
240338fd1498Szrj 	    std::swap (e0, e1);
240438fd1498Szrj 	  edges[n++] = e0;
240538fd1498Szrj 	  edges[n++] = e1;
240638fd1498Szrj 	}
240738fd1498Szrj     }
240838fd1498Szrj 
240938fd1498Szrj   /* Sort the edges, the most desirable first.  When optimizing for size
241038fd1498Szrj      all edges are equally desirable.  */
241138fd1498Szrj 
241238fd1498Szrj   if (optimize_function_for_speed_p (cfun))
241338fd1498Szrj     std::stable_sort (edges, edges + n, edge_order);
241438fd1498Szrj 
241538fd1498Szrj   /* Now decide which of those edges to make fallthrough edges.  We set
241638fd1498Szrj      BB_VISITED if a block already has a fallthrough successor assigned
241738fd1498Szrj      to it.  We make ->AUX of an endpoint point to the opposite endpoint
241838fd1498Szrj      of a sequence of blocks that fall through, and ->AUX will be NULL
241938fd1498Szrj      for a block that is in such a sequence but not an endpoint anymore.
242038fd1498Szrj 
242138fd1498Szrj      To start with, everything points to itself, nothing is assigned yet.  */
242238fd1498Szrj 
242338fd1498Szrj   FOR_ALL_BB_FN (bb, cfun)
242438fd1498Szrj     {
242538fd1498Szrj       bb->aux = bb;
242638fd1498Szrj       bb->flags &= ~BB_VISITED;
242738fd1498Szrj     }
242838fd1498Szrj 
242938fd1498Szrj   EXIT_BLOCK_PTR_FOR_FN (cfun)->aux = 0;
243038fd1498Szrj 
243138fd1498Szrj   /* Now, taking the edges most desirable first, see if each edge can
243238fd1498Szrj      connect two sequences.  If it can, update AUX and BB_VISITED; if it
243338fd1498Szrj      cannot, zero out the edge in the table.  */
243438fd1498Szrj 
243538fd1498Szrj   for (int j = 0; j < n; j++)
243638fd1498Szrj     {
243738fd1498Szrj       edge e = edges[j];
243838fd1498Szrj 
243938fd1498Szrj       basic_block tail_a = e->src;
244038fd1498Szrj       basic_block head_b = e->dest;
244138fd1498Szrj       basic_block head_a = (basic_block) tail_a->aux;
244238fd1498Szrj       basic_block tail_b = (basic_block) head_b->aux;
244338fd1498Szrj 
244438fd1498Szrj       /* An edge cannot connect two sequences if:
244538fd1498Szrj 	 - it crosses partitions;
244638fd1498Szrj 	 - its src is not a current endpoint;
244738fd1498Szrj 	 - its dest is not a current endpoint;
244838fd1498Szrj 	 - or, it would create a loop.  */
244938fd1498Szrj 
245038fd1498Szrj       if (e->flags & EDGE_CROSSING
245138fd1498Szrj 	  || tail_a->flags & BB_VISITED
245238fd1498Szrj 	  || !tail_b
245338fd1498Szrj 	  || (!(head_b->flags & BB_VISITED) && head_b != tail_b)
245438fd1498Szrj 	  || tail_a == tail_b)
245538fd1498Szrj 	{
245638fd1498Szrj 	  edges[j] = 0;
245738fd1498Szrj 	  continue;
245838fd1498Szrj 	}
245938fd1498Szrj 
246038fd1498Szrj       tail_a->aux = 0;
246138fd1498Szrj       head_b->aux = 0;
246238fd1498Szrj       head_a->aux = tail_b;
246338fd1498Szrj       tail_b->aux = head_a;
246438fd1498Szrj       tail_a->flags |= BB_VISITED;
246538fd1498Szrj     }
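
  /* Worked micro-example (editorial note, not in the original source):
     take blocks A, B, C and accept the edges A->B, then B->C.  After
     A->B: A->aux == B, B->aux == A, and A is BB_VISITED (its
     fallthrough successor is now fixed).  After B->C: B becomes an
     interior block with B->aux == NULL, the chain endpoints cross-link
     (A->aux == C, C->aux == A), and B is marked BB_VISITED.  The two
     endpoints of every chain always point at each other, which is
     exactly the invariant the checks above rely on.  */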
246638fd1498Szrj 
246738fd1498Szrj   /* Put the pieces together, in the same order that the start blocks of
246838fd1498Szrj      the sequences already had.  The hot/cold partitioning gives a little
246938fd1498Szrj      complication: as a first pass only do this for blocks in the same
247038fd1498Szrj      partition as the start block, and (if there is anything left to do)
247138fd1498Szrj      in a second pass handle the other partition.  */
247238fd1498Szrj 
247338fd1498Szrj   basic_block last_tail = (basic_block) ENTRY_BLOCK_PTR_FOR_FN (cfun)->aux;
247438fd1498Szrj 
247538fd1498Szrj   int current_partition
247638fd1498Szrj     = BB_PARTITION (last_tail == ENTRY_BLOCK_PTR_FOR_FN (cfun)
247738fd1498Szrj 		    ? EDGE_SUCC (ENTRY_BLOCK_PTR_FOR_FN (cfun), 0)->dest
247838fd1498Szrj 		    : last_tail);
247938fd1498Szrj   bool need_another_pass = true;
248038fd1498Szrj 
248138fd1498Szrj   for (int pass = 0; pass < 2 && need_another_pass; pass++)
248238fd1498Szrj     {
248338fd1498Szrj       need_another_pass = false;
248438fd1498Szrj 
248538fd1498Szrj       FOR_EACH_BB_FN (bb, cfun)
248638fd1498Szrj 	if ((bb->flags & BB_VISITED && bb->aux) || bb->aux == bb)
248738fd1498Szrj 	  {
248838fd1498Szrj 	    if (BB_PARTITION (bb) != current_partition)
248938fd1498Szrj 	      {
249038fd1498Szrj 		need_another_pass = true;
249138fd1498Szrj 		continue;
249238fd1498Szrj 	      }
249338fd1498Szrj 
249438fd1498Szrj 	    last_tail->aux = bb;
249538fd1498Szrj 	    last_tail = (basic_block) bb->aux;
249638fd1498Szrj 	  }
249738fd1498Szrj 
249838fd1498Szrj       current_partition ^= BB_HOT_PARTITION | BB_COLD_PARTITION;
249938fd1498Szrj     }
250038fd1498Szrj 
250138fd1498Szrj   last_tail->aux = 0;
250238fd1498Szrj 
250338fd1498Szrj   /* Finally, link all the chosen fallthrough edges.  */
250438fd1498Szrj 
250538fd1498Szrj   for (int j = 0; j < n; j++)
250638fd1498Szrj     if (edges[j])
250738fd1498Szrj       edges[j]->src->aux = edges[j]->dest;
250838fd1498Szrj 
250938fd1498Szrj   delete[] edges;
251038fd1498Szrj 
251138fd1498Szrj   /* If the entry edge no longer falls through, we have to make a new
251238fd1498Szrj      block so it can do so again.  */
251338fd1498Szrj 
251438fd1498Szrj   edge e = EDGE_SUCC (ENTRY_BLOCK_PTR_FOR_FN (cfun), 0);
251538fd1498Szrj   if (e->dest != ENTRY_BLOCK_PTR_FOR_FN (cfun)->aux)
251638fd1498Szrj     {
251738fd1498Szrj       force_nonfallthru (e);
251838fd1498Szrj       e->src->aux = ENTRY_BLOCK_PTR_FOR_FN (cfun)->aux;
251938fd1498Szrj     }
252038fd1498Szrj }
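
/* Editorial sketch (standalone toy with hypothetical types, not GCC API;
   compile separately): the core of the "simple" algorithm above, reduced
   to arrays.  next[b] plays the role of the chosen fallthrough
   successor; head[]/tail[] are consulted only at chain endpoints and
   mirror the ->aux cross-links used in the real pass.  */

#include <algorithm>
#include <cstdio>
#include <vector>

struct ToyEdge { int src, dest; long count; };

static std::vector<int>
chain_blocks (int n_blocks, std::vector<ToyEdge> edges)
{
  /* Most frequently executed edges first; stable for determinism.  */
  std::stable_sort (edges.begin (), edges.end (),
		    [] (const ToyEdge &a, const ToyEdge &b)
		    { return a.count > b.count; });

  std::vector<int> next (n_blocks, -1);  /* chosen fallthrough successor */
  std::vector<int> prev (n_blocks, -1);  /* chosen fallthrough predecessor */
  std::vector<int> head (n_blocks);      /* head of chain; valid at tails */
  std::vector<int> tail (n_blocks);      /* tail of chain; valid at heads */
  for (int i = 0; i < n_blocks; i++)
    head[i] = tail[i] = i;

  for (const ToyEdge &e : edges)
    {
      /* Reject the edge if either endpoint is already interior to a
	 chain, or if both ends lie in the same chain (a cycle).  */
      if (next[e.src] != -1 || prev[e.dest] != -1)
	continue;
      int head_a = head[e.src];
      int tail_b = tail[e.dest];
      if (e.src == tail_b)
	continue;
      next[e.src] = e.dest;
      prev[e.dest] = e.src;
      head[tail_b] = head_a;             /* endpoints cross-link */
      tail[head_a] = tail_b;
    }
  return next;
}

int
main ()
{
  /* Diamond CFG: 0->1 is hot, 0->2 is cold, both rejoin at block 3.  */
  std::vector<ToyEdge> edges = { {0, 1, 90}, {0, 2, 10},
				 {1, 3, 90}, {2, 3, 10} };
  std::vector<int> next = chain_blocks (4, edges);
  for (int b = 0; b != -1; b = next[b])
    std::printf ("%d ", b);    /* prints "0 1 3 "; block 2 chains alone */
  std::printf ("\n");
  return 0;
}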
252138fd1498Szrj 
252238fd1498Szrj /* Reorder basic blocks.  The main entry point to this file.  */
252338fd1498Szrj 
252438fd1498Szrj static void
252538fd1498Szrj reorder_basic_blocks (void)
252638fd1498Szrj {
252738fd1498Szrj   gcc_assert (current_ir_type () == IR_RTL_CFGLAYOUT);
252838fd1498Szrj 
252938fd1498Szrj   if (n_basic_blocks_for_fn (cfun) <= NUM_FIXED_BLOCKS + 1)
253038fd1498Szrj     return;
253138fd1498Szrj 
253238fd1498Szrj   set_edge_can_fallthru_flag ();
253338fd1498Szrj   mark_dfs_back_edges ();
253438fd1498Szrj 
253538fd1498Szrj   switch (flag_reorder_blocks_algorithm)
253638fd1498Szrj     {
253738fd1498Szrj     case REORDER_BLOCKS_ALGORITHM_SIMPLE:
253838fd1498Szrj       reorder_basic_blocks_simple ();
253938fd1498Szrj       break;
254038fd1498Szrj 
254138fd1498Szrj     case REORDER_BLOCKS_ALGORITHM_STC:
254238fd1498Szrj       reorder_basic_blocks_software_trace_cache ();
254338fd1498Szrj       break;
254438fd1498Szrj 
254538fd1498Szrj     default:
254638fd1498Szrj       gcc_unreachable ();
254738fd1498Szrj     }
254838fd1498Szrj 
254938fd1498Szrj   relink_block_chain (/*stay_in_cfglayout_mode=*/true);
255038fd1498Szrj 
255138fd1498Szrj   if (dump_file)
255238fd1498Szrj     {
255338fd1498Szrj       if (dump_flags & TDF_DETAILS)
255438fd1498Szrj 	dump_reg_info (dump_file);
255538fd1498Szrj       dump_flow_info (dump_file, dump_flags);
255638fd1498Szrj     }
255738fd1498Szrj 
255838fd1498Szrj   /* Signal that rtl_verify_flow_info_1 can now verify that there
255938fd1498Szrj      is at most one switch between hot/cold sections.  */
256038fd1498Szrj   crtl->bb_reorder_complete = true;
256138fd1498Szrj }
256238fd1498Szrj 
256338fd1498Szrj /* Determine which partition the first basic block in the function
256438fd1498Szrj    belongs to, then find the first basic block in the current function
256538fd1498Szrj    that belongs to a different section, and insert a
256638fd1498Szrj    NOTE_INSN_SWITCH_TEXT_SECTIONS note immediately before it in the
256738fd1498Szrj    instruction stream.  When writing out the assembly code,
256838fd1498Szrj    encountering this note will make the compiler switch between the
256938fd1498Szrj    hot and cold text sections.  */
257038fd1498Szrj 
257138fd1498Szrj void
257238fd1498Szrj insert_section_boundary_note (void)
257338fd1498Szrj {
257438fd1498Szrj   basic_block bb;
257538fd1498Szrj   bool switched_sections = false;
257638fd1498Szrj   int current_partition = 0;
257738fd1498Szrj 
257838fd1498Szrj   if (!crtl->has_bb_partition)
257938fd1498Szrj     return;
258038fd1498Szrj 
258138fd1498Szrj   FOR_EACH_BB_FN (bb, cfun)
258238fd1498Szrj     {
258338fd1498Szrj       if (!current_partition)
258438fd1498Szrj 	current_partition = BB_PARTITION (bb);
258538fd1498Szrj       if (BB_PARTITION (bb) != current_partition)
258638fd1498Szrj 	{
258738fd1498Szrj 	  gcc_assert (!switched_sections);
258838fd1498Szrj           switched_sections = true;
258938fd1498Szrj           emit_note_before (NOTE_INSN_SWITCH_TEXT_SECTIONS, BB_HEAD (bb));
259038fd1498Szrj           current_partition = BB_PARTITION (bb);
259138fd1498Szrj 	}
259238fd1498Szrj     }
259338fd1498Szrj 
259438fd1498Szrj   /* Make sure crtl->has_bb_partition matches reality even if bbpart finds
259538fd1498Szrj      some hot and some cold basic blocks, but later one of those kinds is
259638fd1498Szrj      optimized away.  */
259738fd1498Szrj   crtl->has_bb_partition = switched_sections;
259838fd1498Szrj }
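
/* Editorial example (not from the original source): with a final block
   layout H1 H2 C1 C2 (all hot blocks first, then all cold ones), the
   loop above emits exactly one NOTE_INSN_SWITCH_TEXT_SECTIONS,
   immediately before BB_HEAD (C1).  The gcc_assert enforces the single
   hot/cold switch that the reordering pass promises via
   crtl->bb_reorder_complete.  */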
259938fd1498Szrj 
260038fd1498Szrj namespace {
260138fd1498Szrj 
260238fd1498Szrj const pass_data pass_data_reorder_blocks =
260338fd1498Szrj {
260438fd1498Szrj   RTL_PASS, /* type */
260538fd1498Szrj   "bbro", /* name */
260638fd1498Szrj   OPTGROUP_NONE, /* optinfo_flags */
260738fd1498Szrj   TV_REORDER_BLOCKS, /* tv_id */
260838fd1498Szrj   0, /* properties_required */
260938fd1498Szrj   0, /* properties_provided */
261038fd1498Szrj   0, /* properties_destroyed */
261138fd1498Szrj   0, /* todo_flags_start */
261238fd1498Szrj   0, /* todo_flags_finish */
261338fd1498Szrj };
261438fd1498Szrj 
261538fd1498Szrj class pass_reorder_blocks : public rtl_opt_pass
261638fd1498Szrj {
261738fd1498Szrj public:
261838fd1498Szrj   pass_reorder_blocks (gcc::context *ctxt)
261938fd1498Szrj     : rtl_opt_pass (pass_data_reorder_blocks, ctxt)
262038fd1498Szrj   {}
262138fd1498Szrj 
262238fd1498Szrj   /* opt_pass methods: */
262338fd1498Szrj   virtual bool gate (function *)
262438fd1498Szrj     {
262538fd1498Szrj       if (targetm.cannot_modify_jumps_p ())
262638fd1498Szrj 	return false;
262738fd1498Szrj       return (optimize > 0
262838fd1498Szrj 	      && (flag_reorder_blocks || flag_reorder_blocks_and_partition));
262938fd1498Szrj     }
263038fd1498Szrj 
263138fd1498Szrj   virtual unsigned int execute (function *);
263238fd1498Szrj 
263338fd1498Szrj }; // class pass_reorder_blocks
263438fd1498Szrj 
263538fd1498Szrj unsigned int
263638fd1498Szrj pass_reorder_blocks::execute (function *fun)
263738fd1498Szrj {
263838fd1498Szrj   basic_block bb;
263938fd1498Szrj 
264038fd1498Szrj   /* Last attempt to optimize the CFG, since scheduling, peepholing and
264138fd1498Szrj      insn splitting may have introduced more crossjumping opportunities.  */
264238fd1498Szrj   cfg_layout_initialize (CLEANUP_EXPENSIVE);
264338fd1498Szrj 
264438fd1498Szrj   reorder_basic_blocks ();
264538fd1498Szrj   cleanup_cfg (CLEANUP_EXPENSIVE | CLEANUP_NO_PARTITIONING);
264638fd1498Szrj 
264738fd1498Szrj   FOR_EACH_BB_FN (bb, fun)
264838fd1498Szrj     if (bb->next_bb != EXIT_BLOCK_PTR_FOR_FN (fun))
264938fd1498Szrj       bb->aux = bb->next_bb;
265038fd1498Szrj   cfg_layout_finalize ();
265138fd1498Szrj 
265238fd1498Szrj   return 0;
265338fd1498Szrj }
265438fd1498Szrj 
265538fd1498Szrj } // anon namespace
265638fd1498Szrj 
265738fd1498Szrj rtl_opt_pass *
265838fd1498Szrj make_pass_reorder_blocks (gcc::context *ctxt)
265938fd1498Szrj {
266038fd1498Szrj   return new pass_reorder_blocks (ctxt);
266138fd1498Szrj }
266238fd1498Szrj 
266338fd1498Szrj /* Duplicate a block (that we already know ends in a computed jump) into its
266438fd1498Szrj    predecessors, where possible.  Return whether anything was changed.  */
266538fd1498Szrj static bool
266638fd1498Szrj maybe_duplicate_computed_goto (basic_block bb, int max_size)
266738fd1498Szrj {
266838fd1498Szrj   if (single_pred_p (bb))
266938fd1498Szrj     return false;
267038fd1498Szrj 
267138fd1498Szrj   /* Make sure that the block is small enough.  */
267238fd1498Szrj   rtx_insn *insn;
267338fd1498Szrj   FOR_BB_INSNS (bb, insn)
267438fd1498Szrj     if (INSN_P (insn))
267538fd1498Szrj       {
267638fd1498Szrj 	max_size -= get_attr_min_length (insn);
267738fd1498Szrj 	if (max_size < 0)
267838fd1498Szrj 	   return false;
267938fd1498Szrj       }
268038fd1498Szrj 
268138fd1498Szrj   bool changed = false;
268238fd1498Szrj   edge e;
268338fd1498Szrj   edge_iterator ei;
268438fd1498Szrj   for (ei = ei_start (bb->preds); (e = ei_safe_edge (ei)); )
268538fd1498Szrj     {
268638fd1498Szrj       basic_block pred = e->src;
268738fd1498Szrj 
268838fd1498Szrj       /* Do not duplicate BB into PRED if PRED is BB's only remaining
268938fd1498Szrj 	 predecessor, or if we cannot merge a copy of BB with PRED.  */
269038fd1498Szrj       if (single_pred_p (bb)
269138fd1498Szrj 	  || !single_succ_p (pred)
269238fd1498Szrj 	  || e->flags & EDGE_COMPLEX
269338fd1498Szrj 	  || pred->index < NUM_FIXED_BLOCKS
269438fd1498Szrj 	  || (JUMP_P (BB_END (pred)) && !simplejump_p (BB_END (pred)))
269538fd1498Szrj 	  || (JUMP_P (BB_END (pred)) && CROSSING_JUMP_P (BB_END (pred))))
269638fd1498Szrj 	{
269738fd1498Szrj 	  ei_next (&ei);
269838fd1498Szrj 	  continue;
269938fd1498Szrj 	}
270038fd1498Szrj 
270138fd1498Szrj       if (dump_file)
270238fd1498Szrj 	fprintf (dump_file, "Duplicating computed goto bb %d into bb %d\n",
270338fd1498Szrj 		 bb->index, e->src->index);
270438fd1498Szrj 
270538fd1498Szrj       /* Remember if PRED can be duplicated; if so, the copy of BB merged
270638fd1498Szrj 	 with PRED can be duplicated as well.  */
270738fd1498Szrj       bool can_dup_more = can_duplicate_block_p (pred);
270838fd1498Szrj 
270938fd1498Szrj       /* Make a copy of BB, merge it into PRED.  */
271038fd1498Szrj       basic_block copy = duplicate_block (bb, e, NULL);
271138fd1498Szrj       emit_barrier_after_bb (copy);
271238fd1498Szrj       reorder_insns_nobb (BB_HEAD (copy), BB_END (copy), BB_END (pred));
271338fd1498Szrj       merge_blocks (pred, copy);
271438fd1498Szrj 
271538fd1498Szrj       changed = true;
271638fd1498Szrj 
271738fd1498Szrj       /* Try to duplicate the merged PRED into its own predecessors, too.  */
271838fd1498Szrj       if (can_dup_more)
271938fd1498Szrj 	maybe_duplicate_computed_goto (pred, max_size);
272038fd1498Szrj     }
272138fd1498Szrj 
272238fd1498Szrj   return changed;
272338fd1498Szrj }
272438fd1498Szrj 
272538fd1498Szrj /* Duplicate the blocks containing computed gotos.  This basically unfactors
272638fd1498Szrj    computed gotos that were factored early on in the compilation process to
272738fd1498Szrj    speed up edge based data flow.  We used to not unfactor them again, which
272838fd1498Szrj    can seriously pessimize code with many computed jumps in the source code,
272938fd1498Szrj    such as interpreters.  See e.g. PR15242.  */
273038fd1498Szrj static void
273138fd1498Szrj duplicate_computed_gotos (function *fun)
273238fd1498Szrj {
273338fd1498Szrj   /* We estimate the length of the uncond jump insn only once,
273438fd1498Szrj      since the code for getting the insn length always returns
273538fd1498Szrj      the minimal length now.  */
273638fd1498Szrj   if (uncond_jump_length == 0)
273738fd1498Szrj     uncond_jump_length = get_uncond_jump_length ();
273838fd1498Szrj 
273938fd1498Szrj   /* Never copy a block larger than this.  */
274038fd1498Szrj   int max_size
274138fd1498Szrj     = uncond_jump_length * PARAM_VALUE (PARAM_MAX_GOTO_DUPLICATION_INSNS);
274238fd1498Szrj 
274338fd1498Szrj   bool changed = false;
274438fd1498Szrj 
274538fd1498Szrj   /* Try to duplicate all blocks that end in a computed jump and that
274638fd1498Szrj      can be duplicated at all.  */
274738fd1498Szrj   basic_block bb;
274838fd1498Szrj   FOR_EACH_BB_FN (bb, fun)
274938fd1498Szrj     if (computed_jump_p (BB_END (bb)) && can_duplicate_block_p (bb))
275038fd1498Szrj       changed |= maybe_duplicate_computed_goto (bb, max_size);
275138fd1498Szrj 
275238fd1498Szrj   /* Duplicating blocks will redirect edges and may cause hot blocks
275338fd1498Szrj      previously reached by both hot and cold blocks to become dominated
275438fd1498Szrj      only by cold blocks.  */
275538fd1498Szrj   if (changed)
275638fd1498Szrj     fixup_partitions ();
275738fd1498Szrj }
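
/* Editorial sketch (user-level GNU C++, hypothetical opcode set; not
   from the original source): what the unfactoring above buys an
   interpreter.  Factored form: one shared dispatch block that every
   handler jumps back to, so the CPU sees a single indirect branch for
   all opcodes:

       dispatch:
	 goto *table[*pc++];
       op_add:
	 ...
	 goto dispatch;
       op_sub:
	 ...
	 goto dispatch;

   After duplication each handler ends in its own copy of the computed
   jump, giving the branch predictor a separate indirect branch (and a
   separate prediction history) per opcode:

       op_add:
	 ...
	 goto *table[*pc++];
       op_sub:
	 ...
	 goto *table[*pc++];
*/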
275838fd1498Szrj 
275938fd1498Szrj namespace {
276038fd1498Szrj 
276138fd1498Szrj const pass_data pass_data_duplicate_computed_gotos =
276238fd1498Szrj {
276338fd1498Szrj   RTL_PASS, /* type */
276438fd1498Szrj   "compgotos", /* name */
276538fd1498Szrj   OPTGROUP_NONE, /* optinfo_flags */
276638fd1498Szrj   TV_REORDER_BLOCKS, /* tv_id */
276738fd1498Szrj   0, /* properties_required */
276838fd1498Szrj   0, /* properties_provided */
276938fd1498Szrj   0, /* properties_destroyed */
277038fd1498Szrj   0, /* todo_flags_start */
277138fd1498Szrj   0, /* todo_flags_finish */
277238fd1498Szrj };
277338fd1498Szrj 
277438fd1498Szrj class pass_duplicate_computed_gotos : public rtl_opt_pass
277538fd1498Szrj {
277638fd1498Szrj public:
277738fd1498Szrj   pass_duplicate_computed_gotos (gcc::context *ctxt)
277838fd1498Szrj     : rtl_opt_pass (pass_data_duplicate_computed_gotos, ctxt)
277938fd1498Szrj   {}
278038fd1498Szrj 
278138fd1498Szrj   /* opt_pass methods: */
278238fd1498Szrj   virtual bool gate (function *);
278338fd1498Szrj   virtual unsigned int execute (function *);
278438fd1498Szrj 
278538fd1498Szrj }; // class pass_duplicate_computed_gotos
278638fd1498Szrj 
278738fd1498Szrj bool
278838fd1498Szrj pass_duplicate_computed_gotos::gate (function *fun)
278938fd1498Szrj {
279038fd1498Szrj   if (targetm.cannot_modify_jumps_p ())
279138fd1498Szrj     return false;
279238fd1498Szrj   return (optimize > 0
279338fd1498Szrj 	  && flag_expensive_optimizations
279438fd1498Szrj 	  && ! optimize_function_for_size_p (fun));
279538fd1498Szrj }
279638fd1498Szrj 
279738fd1498Szrj unsigned int
279838fd1498Szrj pass_duplicate_computed_gotos::execute (function *fun)
279938fd1498Szrj {
280038fd1498Szrj   duplicate_computed_gotos (fun);
280138fd1498Szrj 
280238fd1498Szrj   return 0;
280338fd1498Szrj }
280438fd1498Szrj 
280538fd1498Szrj } // anon namespace
280638fd1498Szrj 
280738fd1498Szrj rtl_opt_pass *
make_pass_duplicate_computed_gotos(gcc::context * ctxt)280838fd1498Szrj make_pass_duplicate_computed_gotos (gcc::context *ctxt)
280938fd1498Szrj {
281038fd1498Szrj   return new pass_duplicate_computed_gotos (ctxt);
281138fd1498Szrj }
281238fd1498Szrj 
281338fd1498Szrj /* This function is the main 'entrance' for the optimization that
281438fd1498Szrj    partitions hot and cold basic blocks into separate sections of the
281538fd1498Szrj    .o file (to improve performance and cache locality).  Ideally it
281638fd1498Szrj    would be called after all optimizations that rearrange the CFG have
281738fd1498Szrj    been called.  However part of this optimization may introduce new
281838fd1498Szrj    register usage, so it must be called before register allocation has
281938fd1498Szrj    occurred.  This means that this optimization is actually called
282038fd1498Szrj    well before the optimization that reorders basic blocks (see
282138fd1498Szrj    function above).
282238fd1498Szrj 
282338fd1498Szrj    This optimization checks the feedback information to determine
282438fd1498Szrj    which basic blocks are hot/cold, and updates flags on the basic blocks
282538fd1498Szrj    to indicate which section they belong in.  This information is
282638fd1498Szrj    later used for writing out sections in the .o file.  Because hot
282738fd1498Szrj    and cold sections can be arbitrarily large (within the bounds of
282838fd1498Szrj    memory), far beyond the size of a single function, it is necessary
282938fd1498Szrj    to fix up all edges that cross section boundaries, to make sure the
283038fd1498Szrj    instructions used can actually span the required distance.  The
283138fd1498Szrj    fixes are described below.
283238fd1498Szrj 
283338fd1498Szrj    Fall-through edges must be changed into jumps; it is not safe or
283438fd1498Szrj    legal to fall through across a section boundary.  Whenever a
283538fd1498Szrj    fall-through edge crossing a section boundary is encountered, a new
283638fd1498Szrj    basic block is inserted (in the same section as the fall-through
283738fd1498Szrj    source), and the fall through edge is redirected to the new basic
283838fd1498Szrj    block.  The new basic block contains an unconditional jump to the
283938fd1498Szrj    original fall-through target.  (If the unconditional jump is
284038fd1498Szrj    insufficient to cross section boundaries, that is dealt with a
284138fd1498Szrj    little later, see below).
284238fd1498Szrj 
284338fd1498Szrj    In order to deal with architectures that have short conditional
284438fd1498Szrj    branches (which cannot span all of memory) we take any conditional
284538fd1498Szrj    jump that attempts to cross a section boundary and add a level of
284638fd1498Szrj    indirection: it becomes a conditional jump to a new basic block, in
284738fd1498Szrj    the same section.  The new basic block contains an unconditional
284838fd1498Szrj    jump to the original target, in the other section.
284938fd1498Szrj 
285038fd1498Szrj    For those architectures whose unconditional branch is also
285138fd1498Szrj    incapable of reaching all of memory, those unconditional jumps are
285238fd1498Szrj    converted into indirect jumps, through a register.
285338fd1498Szrj 
285438fd1498Szrj    IMPORTANT NOTE: This optimization causes some messy interactions
285538fd1498Szrj    with the cfg cleanup optimizations; those optimizations want to
285638fd1498Szrj    merge blocks wherever possible, and to collapse indirect jump
285738fd1498Szrj    sequences (change "A jumps to B jumps to C" directly into "A jumps
285838fd1498Szrj    to C").  Those optimizations can undo the jump fixes that
285938fd1498Szrj    partitioning is required to make (see above), in order to ensure
286038fd1498Szrj    that jumps attempting to cross section boundaries are really able
286138fd1498Szrj    to cover whatever distance the jump requires (on many architectures
286238fd1498Szrj    conditional or unconditional jumps are not able to reach all of
286338fd1498Szrj    memory).  Therefore tests have to be inserted into each such
286438fd1498Szrj    optimization to make sure that it does not undo the fixes needed to
286538fd1498Szrj    cross partition boundaries.  This would be much less of a problem
286638fd1498Szrj    if we could perform this optimization later in the compilation, but
286738fd1498Szrj    unfortunately the fact that we may need to create indirect jumps
286838fd1498Szrj    (through registers) requires that this optimization be performed
286938fd1498Szrj    before register allocation.
287038fd1498Szrj 
287138fd1498Szrj    Hot and cold basic blocks are partitioned and put in separate
287238fd1498Szrj    sections of the .o file, to reduce paging and improve cache
287338fd1498Szrj    performance (hopefully).  This can result in bits of code from the
287438fd1498Szrj    same function being widely separated in the .o file.  However this
287538fd1498Szrj    is not visible to the current bb structure.  Therefore we must take
287638fd1498Szrj    care to ensure that: 1). There are no fall_thru edges that cross
287738fd1498Szrj    between sections; 2). For those architectures which have "short"
287838fd1498Szrj    conditional branches, all conditional branches that attempt to
287938fd1498Szrj    cross between sections are converted to unconditional branches;
288038fd1498Szrj    and, 3). For those architectures which have "short" unconditional
288138fd1498Szrj    branches, all unconditional branches that attempt to cross between
288238fd1498Szrj    sections are converted to indirect jumps.
288338fd1498Szrj 
288438fd1498Szrj    The code for fixing up fall_thru edges that cross between hot and
288538fd1498Szrj    cold basic blocks does so by creating new basic blocks containing
288638fd1498Szrj    unconditional branches to the appropriate label in the "other"
288738fd1498Szrj    section.  The new basic block is then put in the same (hot or cold)
288838fd1498Szrj    section as the original fall-through source, and the fall_thru edge
288938fd1498Szrj    is modified to fall into the new basic block instead.  By adding
289038fd1498Szrj    this level of indirection we end up with only unconditional branches
289138fd1498Szrj    crossing between hot and cold sections.
289238fd1498Szrj 
289338fd1498Szrj    Conditional branches are dealt with by adding a level of indirection.
289438fd1498Szrj    A new basic block is added in the same (hot/cold) section as the
289538fd1498Szrj    conditional branch, and the conditional branch is retargeted to the
289638fd1498Szrj    new basic block.  The new basic block contains an unconditional branch
289738fd1498Szrj    to the original target of the conditional branch (in the other section).
289838fd1498Szrj 
289938fd1498Szrj    Unconditional branches are dealt with by converting them into
290038fd1498Szrj    indirect jumps.  */
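
/* Editorial schematic (assumed labels, pseudo-assembly; not from the
   original source) of the three fixes described above, for an edge that
   crosses from hot block A to cold block B:

     fall-through:   "A falls into B" becomes "A falls into a new hot
                     block A1", where A1 contains only "jmp B".

     conditional:    "jcc B" becomes "jcc A1", with A1 a new block in
                     the hot section containing "jmp B" (done only when
                     !HAS_LONG_COND_BRANCH).

     unconditional:  "jmp B" becomes "load the address of B into a
                     register, then jump indirectly through it" (done
                     only when !HAS_LONG_UNCOND_BRANCH).  */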
290138fd1498Szrj 
290238fd1498Szrj namespace {
290338fd1498Szrj 
290438fd1498Szrj const pass_data pass_data_partition_blocks =
290538fd1498Szrj {
290638fd1498Szrj   RTL_PASS, /* type */
290738fd1498Szrj   "bbpart", /* name */
290838fd1498Szrj   OPTGROUP_NONE, /* optinfo_flags */
290938fd1498Szrj   TV_REORDER_BLOCKS, /* tv_id */
291038fd1498Szrj   PROP_cfglayout, /* properties_required */
291138fd1498Szrj   0, /* properties_provided */
291238fd1498Szrj   0, /* properties_destroyed */
291338fd1498Szrj   0, /* todo_flags_start */
291438fd1498Szrj   0, /* todo_flags_finish */
291538fd1498Szrj };
291638fd1498Szrj 
291738fd1498Szrj class pass_partition_blocks : public rtl_opt_pass
291838fd1498Szrj {
291938fd1498Szrj public:
292038fd1498Szrj   pass_partition_blocks (gcc::context *ctxt)
292138fd1498Szrj     : rtl_opt_pass (pass_data_partition_blocks, ctxt)
292238fd1498Szrj   {}
292338fd1498Szrj 
292438fd1498Szrj   /* opt_pass methods: */
292538fd1498Szrj   virtual bool gate (function *);
292638fd1498Szrj   virtual unsigned int execute (function *);
292738fd1498Szrj 
292838fd1498Szrj }; // class pass_partition_blocks
292938fd1498Szrj 
293038fd1498Szrj bool
293138fd1498Szrj pass_partition_blocks::gate (function *fun)
293238fd1498Szrj {
293338fd1498Szrj   /* The optimization to partition hot/cold basic blocks into separate
293438fd1498Szrj      sections of the .o file does not work well with linkonce, with
2935*58e805e6Szrj      user-defined section attributes, or with the naked attribute.  Don't
2936*58e805e6Szrj      call it if any of these cases arises.  */
293738fd1498Szrj   return (flag_reorder_blocks_and_partition
293838fd1498Szrj 	  && optimize
293938fd1498Szrj 	  /* See pass_reorder_blocks::gate.  We should not partition if
294038fd1498Szrj 	     we are going to omit the reordering.  */
294138fd1498Szrj 	  && optimize_function_for_speed_p (fun)
294238fd1498Szrj 	  && !DECL_COMDAT_GROUP (current_function_decl)
294338fd1498Szrj 	  && !lookup_attribute ("section", DECL_ATTRIBUTES (fun->decl))
2944*58e805e6Szrj 	  && !lookup_attribute ("naked", DECL_ATTRIBUTES (fun->decl))
294538fd1498Szrj 	  /* Workaround a bug in GDB where read_partial_die doesn't cope
294638fd1498Szrj 	     with DIEs with DW_AT_ranges, see PR81115.  */
294738fd1498Szrj 	  && !(in_lto_p && MAIN_NAME_P (DECL_NAME (fun->decl))));
294838fd1498Szrj }
294938fd1498Szrj 
295038fd1498Szrj unsigned
295138fd1498Szrj pass_partition_blocks::execute (function *fun)
295238fd1498Szrj {
295338fd1498Szrj   vec<edge> crossing_edges;
295438fd1498Szrj 
295538fd1498Szrj   if (n_basic_blocks_for_fn (fun) <= NUM_FIXED_BLOCKS + 1)
295638fd1498Szrj     return 0;
295738fd1498Szrj 
295838fd1498Szrj   df_set_flags (DF_DEFER_INSN_RESCAN);
295938fd1498Szrj 
296038fd1498Szrj   crossing_edges = find_rarely_executed_basic_blocks_and_crossing_edges ();
296138fd1498Szrj   if (!crossing_edges.exists ())
296238fd1498Szrj     /* Make sure to process deferred rescans and clear changeable df flags.  */
296338fd1498Szrj     return TODO_df_finish;
296438fd1498Szrj 
296538fd1498Szrj   crtl->has_bb_partition = true;
296638fd1498Szrj 
296738fd1498Szrj   /* Make sure the source of any crossing edge ends in a jump and the
296838fd1498Szrj      destination of any crossing edge has a label.  */
296938fd1498Szrj   add_labels_and_missing_jumps (crossing_edges);
297038fd1498Szrj 
297138fd1498Szrj   /* Convert each crossing fall_thru edge into a non-crossing fall
297238fd1498Szrj      through to a new block that jumps unconditionally to the original
297338fd1498Szrj      fall-through dest.  */
297438fd1498Szrj   fix_up_fall_thru_edges ();
297538fd1498Szrj 
297638fd1498Szrj   /* If the architecture does not have conditional branches that can
297738fd1498Szrj      span all of memory, convert crossing conditional branches into
297838fd1498Szrj      crossing unconditional branches.  */
297938fd1498Szrj   if (!HAS_LONG_COND_BRANCH)
298038fd1498Szrj     fix_crossing_conditional_branches ();
298138fd1498Szrj 
298238fd1498Szrj   /* If the architecture does not have unconditional branches that
298338fd1498Szrj      can span all of memory, convert crossing unconditional branches
298438fd1498Szrj      into indirect jumps.  Since adding an indirect jump also adds
298538fd1498Szrj      a new register usage, update the register usage information as
298638fd1498Szrj      well.  */
298738fd1498Szrj   if (!HAS_LONG_UNCOND_BRANCH)
298838fd1498Szrj     fix_crossing_unconditional_branches ();
298938fd1498Szrj 
299038fd1498Szrj   update_crossing_jump_flags ();
299138fd1498Szrj 
299238fd1498Szrj   /* Clear bb->aux fields that the above routines were using.  */
299338fd1498Szrj   clear_aux_for_blocks ();
299438fd1498Szrj 
299538fd1498Szrj   crossing_edges.release ();
299638fd1498Szrj 
299738fd1498Szrj   /* ??? FIXME: DF generates the bb info for a block immediately.
299838fd1498Szrj      And by immediately, I mean *during* creation of the block.
299938fd1498Szrj 
300038fd1498Szrj 	#0  df_bb_refs_collect
300138fd1498Szrj 	#1  in df_bb_refs_record
300238fd1498Szrj 	#2  in create_basic_block_structure
300338fd1498Szrj 
300438fd1498Szrj      Which means that the bb_has_eh_pred test in df_bb_refs_collect
300538fd1498Szrj      will *always* fail, because no edges can have been added to the
300638fd1498Szrj      block yet.  Which of course means we don't add the right
300738fd1498Szrj      artificial refs, which means we fail df_verify (much) later.
300838fd1498Szrj 
300938fd1498Szrj      Cleanest solution would seem to make DF_DEFER_INSN_RESCAN imply
301038fd1498Szrj      that we also shouldn't grab data from the new blocks those new
301138fd1498Szrj      insns are in either.  In this way one can create the block, link
301238fd1498Szrj      it up properly, and have everything Just Work later, when deferred
301338fd1498Szrj      insns are processed.
301438fd1498Szrj 
301538fd1498Szrj      In the meantime, we have no other option but to throw away all
301638fd1498Szrj      of the DF data and recompute it all.  */
301738fd1498Szrj   if (fun->eh->lp_array)
301838fd1498Szrj     {
301938fd1498Szrj       df_finish_pass (true);
302038fd1498Szrj       df_scan_alloc (NULL);
302138fd1498Szrj       df_scan_blocks ();
302238fd1498Szrj       /* Not all post-landing pads use all of the EH_RETURN_DATA_REGNO
302338fd1498Szrj 	 data.  We blindly generated all of them when creating the new
302438fd1498Szrj 	 landing pad.  Delete those assignments we don't use.  */
302538fd1498Szrj       df_set_flags (DF_LR_RUN_DCE);
302638fd1498Szrj       df_analyze ();
302738fd1498Szrj     }
302838fd1498Szrj 
302938fd1498Szrj   /* Make sure to process deferred rescans and clear changeable df flags.  */
303038fd1498Szrj   return TODO_df_finish;
303138fd1498Szrj }
303238fd1498Szrj 
303338fd1498Szrj } // anon namespace
303438fd1498Szrj 
303538fd1498Szrj rtl_opt_pass *
make_pass_partition_blocks(gcc::context * ctxt)303638fd1498Szrj make_pass_partition_blocks (gcc::context *ctxt)
303738fd1498Szrj {
303838fd1498Szrj   return new pass_partition_blocks (ctxt);
303938fd1498Szrj }